Abstract. In contrast to the usual Lipschitz seminorms associated to ordinary metrics on
compact spaces, we show by examples that Lipschitz seminorms on possibly non-commutative
compact spaces are usually not determined by the restriction of the metric they define on the
state space, to the extreme points of the state space. We characterize the Lipschitz norms 
which are determined by their metric on the whole state space as being those which are lower 
semicontinuous. We show that their domain of Lipschitz elements can be enlarged so as to 
form a dual Banach space, which generalizes the situation for ordinary Lipschitz seminorms. 
We give a characterization of the metrics on state spaces which come from Lipschitz semi- 
norms. The natural (broader) setting for these results is provided by the "function spaces" 
of Kadison. A variety of methods for constructing Lipschitz seminorms is indicated. 


In non-commutative geometry (based on C*-algebras), the natural way to specify a
metric is by means of a suitable "Lipschitz seminorm". This idea was first suggested by
Connes [C1] and developed further in [C2, C3]. Connes pointed out [C1, C2] that from
a Lipschitz seminorm one obtains in a simple way an ordinary metric on the state space
of the C*-algebra. This metric generalizes the Monge-Kantorovich metric on probability
measures [KA, Ra, RR]. In this article we make more precise the relationship between
metrics on the state space and Lipschitz seminorms.

Let $\rho$ be an ordinary metric on a compact space X. The Lipschitz seminorm, $L_\rho$,
determined by $\rho$ is defined on functions f on X by

(0.1) $L_\rho(f) = \sup\{|f(x) - f(y)|/\rho(x, y) : x \neq y\}$.

(This can take value $+\infty$.) It is known that one can recover $\rho$ from $L_\rho$ by the relationship

$\rho(x, y) = \sup\{|f(x) - f(y)| : L_\rho(f) \le 1\}$.

But a slight extension of this relationship defines a metric, $\bar\rho$, on the space S(X) of
probability measures on X, by

(0.2) $\bar\rho(\mu, \nu) = \sup\{|\mu(f) - \nu(f)| : L_\rho(f) \le 1\}$.

This is the Monge-Kantorovich metric. The topology which it defines on S(X) coincides
with the weak-* topology on S(X) coming from viewing it as the state space of the
C*-algebra C(X). The extreme points of S(X) are identified with the points of X. On the
extreme points $\bar\rho$ coincides with $\rho$. Thus relationship (0.1) can be viewed as saying that
$L_\rho$ can be recovered just from the restriction of its metric $\bar\rho$ on S(X) to the set of extreme
points of S(X).
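
As a small illustration of (0.2), not taken from the sources cited here, the following Python sketch evaluates the Monge-Kantorovich distance between two probability measures carried by finitely many points of the real line. It rests on two assumptions of ours: the points are listed in increasing order, and the metric is the usual distance on the line, so that the condition $L_\rho(f) \le 1$ reduces to a bound on the increments of f between neighbouring points and the supremum is attained when every increment is plus or minus the gap. The function names are ours.

```python
from itertools import product


def kantorovich_by_search(points, mu, nu):
    """Evaluate (0.2) for measures on sorted real points by direct search.

    Only functions whose increments between neighbouring points equal
    plus or minus the gap are tried; these are the extreme 1-Lipschitz
    functions for the metric of the line (up to an additive constant).
    """
    gaps = [b - a for a, b in zip(points, points[1:])]
    best = 0.0
    for signs in product((-1.0, 1.0), repeat=len(gaps)):
        f = [0.0]
        for s, gap in zip(signs, gaps):
            f.append(f[-1] + s * gap)
        value = abs(sum((m - n) * fx for m, n, fx in zip(mu, nu, f)))
        best = max(best, value)
    return best


def kantorovich_by_cdf(points, mu, nu):
    """Same distance via differences of cumulative distribution functions."""
    total = 0.0
    cum_mu = cum_nu = 0.0
    for i in range(len(points) - 1):
        cum_mu += mu[i]
        cum_nu += nu[i]
        total += abs(cum_mu - cum_nu) * (points[i + 1] - points[i])
    return total
```

For two point masses the search returns the distance between the two points, which is the statement that $\bar\rho$ coincides with $\rho$ on the extreme points.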

Suppose now that A is a unital C*-algebra with state space S(A), and let L be a
Lipschitz seminorm on A. (Precise definitions are given in Section 2.) Following Connes
[C1, C2], we define a metric, $\rho_L$, on S(A) by the evident analogue of (0.2). We show by
simple finite-dimensional examples determined by Dirac operators that L may well not be
determined by the restriction of $\rho_L$ to the extreme points of S(A).

It is then natural to ask whether L is determined by $\rho_L$ on all of S(A), by a formula
analogous to (0.1). One of our main theorems (Theorem 4.1) states that the Lipschitz
seminorms for which this is true are exactly those which are lower semicontinuous in a
suitable sense.

For ordinary compact metric spaces (X, $\rho$) it is known that the space of Lipschitz
functions with a norm coming from the Lipschitz seminorm is the dual of a certain other
Banach space. Another of our main theorems (Theorem 5.2) states that the same is true
in our non-commutative setting, and we give a natural description of this predual. We also
characterize the metrics on S(A) which come from Lipschitz seminorms (Theorem 9.11).

We should make precise that we ultimately require that our Lipschitz seminorms be 
such that the metric on S(A) which they determine gives the weak-* topology on S(A). 
An elementary characterization of exactly when this happens was given in [Rf]. (See 
also [P].) This property obviously holds for finite-dimensional C*-algebras. It is known 
to hold in many situations for commutative C*-algebras, as well as for C*-algebras ob- 
tained by combining commutative ones with finite dimensional ones. But this property 
has not been verified for many examples beyond those. However in [Rf] this property was 
verified for some interesting infinite-dimensional non-commutative examples, such as the 
non-commutative tori, and I expect that eventually it will be found to hold in a wide 
variety of situations. 

Actually, we will see below that the natural setting for our study is the broader one of 
order-unit spaces. The theory of these spaces was launched by Kadison in his memoir
[K1]. For this reason it is especially appropriate to dedicate this article to him. (In [K2]
Kadison uses the terminology "function systems", but we will follow [Al] in using the 
terminology "order-unit space" as being a bit more descriptive of these objects.) 

On the other hand, most of the interesting constructions currently in view of Lipschitz 
seminorms on non-commutative C*-algebras, such as those from Dirac operators, or those 
in [Rf], also provide in a natural way seminorms on all the matrix algebras over the 
algebras. Thus it is likely that "matrix Lipschitz seminorms" in analogy with the matrix 
norms of [Ef] will eventually be of importance. But I have not yet seen how to use them 
in a significant way, and so we do not deal with them here. 

Let us mention here that a variety of metrics on the state spaces of full matrix algebras 
have been employed by the practitioners of quantum mechanics. A recent representative 
paper where many references can be found is [ZS]. We will later make a few comments 
relating some of these metrics to the considerations of the present paper. 

The last three sections of this paper will be devoted to a discussion of the great variety of 
ways in which Lipschitz seminorms can arise, even for commutative algebras. We do discuss 
here some non-commutative examples, but most of our examples are commutative. I hope 
in a later paper to discuss and apply some other important classes of non-commutative 
examples. Some of the applications which I have in mind will require extending the theory 
developed here to quotients and sub-objects. 

Finally, we should remark that while we give here considerable attention to how Dirac 
operators give metrics on state spaces, Connes has shown [C2] that Dirac operators encode 
far more than just the metric information. In particular they give extensive homological 
information. But we do not discuss this aspect here. 

I thank Nik Weaver for suggestions for improvement of the first version of this article, 
which are acknowledged more specifically below. 

1. Recollections on order-unit spaces 

We recall [Al] that an order-unit space is a real partially-ordered vector space, A, with 
a distinguished element e, the order unit, which satisfies: 

1) (Order unit property) For each $a \in A$ there is an $r \in \mathbb{R}$ such that $a \le re$.

2) (Archimedean property) If $a \in A$ and if $a \le re$ for all $r \in \mathbb{R}^+$, then $a \le 0$.

For any $a \in A$ we set

$\|a\| = \inf\{r \in \mathbb{R}^+ : -re \le a \le re\}$.

We obtain in this way a norm on A. In turn, the order can be recovered from the norm,
because $0 \le a \le e$ iff $\|a\| \le 1$ and $\|e - a\| \le 1$. The primary source of examples consists
of the linear spaces of all self-adjoint elements in unital C*-algebras, with the identity 
element serving as order unit. But any linear space of bounded self-adjoint operators on 
a Hilbert space will be an order-unit space if it contains the identity operator. We expect 
that this broader class of examples will be important for the applications of metrics on
state spaces. 

We will not assume that A is complete for its norm. This is important for us because 
the domains of Lipschitz norms will be dense, but usually not closed, in the completion. 
(The completion is always again an order-unit space.) This also accords with the definition
in [Al]. 

By a state of an order-unit space (A, e) we mean a continuous linear functional, $\mu$, on
A such that $\mu(e) = 1 = \|\mu\|$. States are automatically positive. We denote the collection
of all the states of A, i.e. the state space of A, by S(A). It is a w*-compact convex subset
of the Banach space dual, A', of A.

To each $a \in A$ we get a function, $\hat a$, on S(A) defined by $\hat a(\mu) = \mu(a)$. Then $\hat a$ is an
affine function on S(A) which is continuous by the definition of the w*-topology. The basic
representation theorem of Kadison [K1, K2, K3] (see Theorem II.1.8 of [Al]) says that for
any order-unit space the representation $a \mapsto \hat a$ is an isometric order isomorphism of A
onto a dense subspace of the space Af(S(A)) of all continuous affine functions on S(A),
equipped with the supremum norm and the usual order on functions (and with e clearly
carried to the constant function 1). In particular, if A is complete, then it is isomorphic
to all of Af(S(A)).

Thus we can view the order-unit spaces as exactly the dense subspaces containing 1 
inside Af(K), where K is any compact convex subset of a topological vector space. This 
provides an effective view from which to see many of the properties of order-unit spaces. 
Most of our theoretical discussion will be carried out in the setting of order-unit spaces 
and Af(K), though our examples will usually involve specific C*-algebras. We let C(K) 
denote the real C*-algebra of all continuous functions on K, in which Af(K) sits as a
closed subspace. 

It will be important for us to work on the quotient vector space $\tilde A = A/(\mathbb{R}e)$. We let
$\|\cdot\|^\sim$ denote the quotient norm on $\tilde A$ from $\|\cdot\|$. This quotient norm is easily described.
For $a \in A$ set

$\max(a) = \inf\{r : a \le re\}$
$\min(a) = \sup\{r : re \le a\}$,

so that $\|a\| = (\max(a)) \vee (-\min(a))$. Then it is easily seen that

$\|\tilde a\|^\sim = (\max(a) - \min(a))/2$.
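
To make the formula for the quotient norm concrete, here is a minimal Python sketch for the simplest commutative case, which is our choice of illustration: A is the space of real functions on a finite set, with the constant function 1 as order unit, so that max(a) and min(a) are the largest and smallest values of a. The function names are ours. The second function checks the closed formula against a direct grid search for the scalar s minimizing $\|a - se\|$.

```python
def order_unit_norms(a):
    """Return (norm, quotient norm, optimal shift) for a function on a finite set.

    The order unit e is the constant function 1, so max(a) and min(a)
    in the sense of the text are the ordinary maximum and minimum.
    """
    hi = max(a)
    lo = min(a)
    norm = max(hi, -lo)
    quotient_norm = (hi - lo) / 2.0
    best_shift = (hi + lo) / 2.0  # the scalar s for which ||a - s e|| is smallest
    return norm, quotient_norm, best_shift


def quotient_norm_by_search(a, steps=2001):
    """Approximate inf over s of ||a - s e|| on a grid of shifts in [min(a), max(a)]."""
    hi, lo = max(a), min(a)
    best = float("inf")
    for k in range(steps):
        s = lo + (hi - lo) * k / (steps - 1)
        best = min(best, max(abs(x - s) for x in a))
    return best
```

For a = (3, -1, 0) the norm is 3, while the quotient norm is 2, attained by subtracting the scalar 1.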



2. The radius of the state space 

Let A be an order-unit space. Since the term "Lipschitz seminorm" has somewhat wide 
but imprecise usage, we will not use this term for our main objects of precise study (which 
we will define in Section 5). Almost the minimal requirement for a Lipschitz seminorm is
that its null-space be exactly the scalar multiples of the order unit. We will use the term 
"Lipschitz seminorm" in this general sense. We emphasize that a Lipschitz seminorm will 
usually not be continuous for || ||. 

Let L be a Lipschitz seminorm on A. For $\mu, \nu \in S(A)$ we can define a metric, $\rho_L$, on
S(A) by

$\rho_L(\mu, \nu) = \sup\{|\mu(a) - \nu(a)| : L(a) \le 1\}$

(which may be $+\infty$). Then $\rho_L$ determines a topology on S(A). Eventually we want to
require that this topology agrees with the weak-* topology. Since S(A) is weak-* compact,
$\rho_L$ must then give S(A) finite diameter. We examine this latter aspect here, in part to
establish further notation. 

It is actually more convenient for us to work with "radius" (half the diameter), since 
this will avoid factors of 2 in various places. We would like to use the properties of order- 
unit spaces to express the radius in terms of L in a somewhat more precise way than was 
implicit in [Rf] in its more general context. The following considerations [Al] will also be 
used extensively later. 

As in [Rf] and in the previous section, we denote the quotient vector space $A/(\mathbb{R}e)$ by
$\tilde A$, with its quotient norm $\|\cdot\|^\sim$. But in addition to this norm, the quotient seminorm $\tilde L$
from L is also a norm on $\tilde A$, since L takes value 0 only on $\mathbb{R}e$.

The dual Banach space to $\tilde A$ for $\|\cdot\|^\sim$ is just ${A'}^0$, the subspace of A' consisting of those
$\lambda \in A'$ such that $\lambda(e) = 0$. We denote the norm on A' dual to $\|\cdot\|$ still by $\|\cdot\|$. The
dual norm on ${A'}^0$ is just the restriction of $\|\cdot\|$ to ${A'}^0$. If we view A as a dense subspace
of $Af(K) \subseteq C(K)$, then by the Hahn-Banach theorem $\lambda$ extends (not uniquely) to C(K)
with the same norm. There we can take the Jordan decomposition into disjoint non-negative
measures. Note that for positive measures their norm on C(K) equals their norm on A,
since $e \in A$. Thus we find $\mu, \nu \ge 0$ such that $\lambda = \mu - \nu$ and $\|\lambda\| = \|\mu\| + \|\nu\|$. But
$0 = \lambda(e) = \mu(e) - \nu(e) = \|\mu\| - \|\nu\|$. Consequently $\|\mu\| = \|\nu\| = \|\lambda\|/2$. Thus if $\|\lambda\| \le 2$
we have $\|\mu\| = \|\nu\| \le 1$. If $\|\lambda\| \le 2$ set $t = \|\mu\| \le 1$, and rescale $\mu$ and $\nu$ so that they are
in S(A). Then

$\lambda = t\mu - t\nu = \mu - (t\nu + (1 - t)\mu)$.

Now $(t\nu + (1 - t)\mu)$ is no longer disjoint from $\mu$, but we have obtained the following lemma,
which will be used in a number of places.

2.1 Lemma. The ball $D_2$ of radius 2 about 0 in ${A'}^0$ coincides with $\{\mu - \nu : \mu, \nu \in S(A)\}$.
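
The construction behind the lemma can be followed step by step in the finite commutative case. The sketch below is ours and assumes that A is the space of real functions on a finite set, so that a functional vanishing on e is a list of real numbers with sum 0, its norm is the sum of the absolute values, and the Jordan decomposition is the splitting into positive and negative parts. The function name and the tolerance parameter are hypothetical.

```python
def split_into_states(lam, tol=1e-12):
    """Write lam (sum 0, l1-norm at most 2) as mu - nu with mu, nu probability vectors."""
    if abs(sum(lam)) > tol:
        raise ValueError("lam must vanish on the order unit (entries sum to 0)")
    pos = [max(x, 0.0) for x in lam]   # positive part of the Jordan decomposition
    neg = [max(-x, 0.0) for x in lam]  # negative part; it has the same total mass
    t = sum(pos)                       # t = ||lam|| / 2
    if t > 1.0 + tol:
        raise ValueError("lam must have norm at most 2")
    if t <= tol:
        # lam is (numerically) zero: any state minus itself will do.
        mu = [1.0] + [0.0] * (len(lam) - 1)
        return mu, list(mu)
    mu = [x / t for x in pos]          # rescaled so that mu is a state
    nu0 = [x / t for x in neg]         # rescaled so that nu0 is a state
    # lam = t*mu - t*nu0 = mu - (t*nu0 + (1 - t)*mu)
    nu = [t * y + (1.0 - t) * x for x, y in zip(mu, nu0)]
    return mu, nu
```

For lam = (0.5, -0.25, -0.25) this gives mu = (1, 0, 0) and nu = (0.5, 0.25, 0.25), whose supports overlap, in line with the remark that the second state is no longer disjoint from the first.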

Notice that if there is an $a \in A$ such that L(a) = 0 but $a \notin \mathbb{R}e$, then from this lemma
we can find $\mu, \nu \in S(A)$ such that $(\mu - \nu)(a) \neq 0$, so that $\rho_L(\mu, \nu) = +\infty$. Thus our
standing assumption that there is no such a serves to reduce the possibility of having
infinite distances. But it does not eliminate this possibility, as seen by the example of
the algebra of smooth (or Lipschitz) functions of compact support on the real line, with
constant functions adjoined, and with the usual Lipschitz seminorm. 

2.2 Proposition. With notation as earlier, the following conditions are equivalent for an
$r \in \mathbb{R}^+$:

1) For all $\mu, \nu \in S(A)$ we have $\rho_L(\mu, \nu) \le 2r$.

2) For all $a \in A$ we have $\|\tilde a\|^\sim \le r\tilde L(\tilde a)$.

Proof. Suppose that condition 1 holds. Let $a \in A$ and $\lambda \in D_2$. Then by the lemma
$\lambda = \mu - \nu$ for some $\mu, \nu \in S(A)$. Thus

$|\lambda(a)| = |(\mu - \nu)(a)| \le L(a)\rho_L(\mu, \nu) \le L(a)2r$.

Since $\lambda(e) = 0$, this inequality holds whenever a is replaced by a + se for $s \in \mathbb{R}$. Thus
condition 2 holds.

Conversely, suppose that condition 2 holds. Then for any $\mu, \nu \in S(A)$ and $a \in A$ with
$L(a) \le 1$ we have

$|\mu(a) - \nu(a)| = |(\mu - \nu)(a)| \le 2\|\tilde a\|^\sim \le 2r$.

Thus $\rho_L(\mu, \nu) \le 2r$ as desired. □

Of course, we call the smallest r for which the conditions of this proposition hold the 
radius of S(A). 
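
Condition 2 of the proposition can be checked by hand in the finite commutative case, which we use here only as an illustration. We assume that A is the space of real functions on a finite metric space given by a distance matrix, and that L is the usual Lipschitz seminorm. In that case the radius is half the diameter of the underlying space: the difference max(f) - min(f) is |f(x) - f(y)| for some pair of points, hence is at most L(f) times the diameter, and the function measuring distance from one end of a diameter shows that no smaller constant works. The function names below are ours.

```python
from itertools import combinations


def lipschitz_seminorm(f, dist):
    """Usual Lipschitz seminorm of f on a finite metric space with distance matrix dist."""
    pairs = combinations(range(len(f)), 2)
    return max(abs(f[i] - f[j]) / dist[i][j] for i, j in pairs)


def quotient_norm(f):
    """Norm of f modulo constants, (max(f) - min(f)) / 2."""
    return (max(f) - min(f)) / 2.0


def radius(dist):
    """Half the diameter of the finite metric space."""
    pairs = combinations(range(len(dist)), 2)
    return max(dist[i][j] for i, j in pairs) / 2.0


def condition_2_holds(f, dist, r, tol=1e-12):
    """Check the inequality of condition 2 for a single function f."""
    return quotient_norm(f) <= r * lipschitz_seminorm(f, dist) + tol
```

For three points at mutual distance 1 the radius is 1/2, and the function f = (0, 2, 5) satisfies condition 2 with r = 1/2 but not with r = 0.4.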

We caution that just because a metric space has radius r, it does not follow that there 
is a ball of radius r which contains it, as can be seen by considering equilateral triangles 
in the plane. We remark that just because $\rho_L$ gives S(A) finite radius, it does not follow
that $\rho_L$ gives the weak-* topology. Perhaps the simplest example arises when A is infinite
dimensional and $L(a) = \|\tilde a\|^\sim$.

3. Lower semicontinuity for Lipschitz seminorms 

Let L be any Lipschitz seminorm on an order-unit space A. (We will not at first require
that it give S(A) finite diameter.) We would like to show that L and $\rho_L$ contain the
same information, and more specifically that we can recover L from $\rho_L$ as being the usual
Lipschitz seminorm for $\rho_L$. By this we mean the following. Let $\rho$ be any metric on S(A),
possibly taking value $+\infty$. Define $L_\rho$ on C(S(A)) by

(3.1) $L_\rho(f) = \sup\{|f(\mu) - f(\nu)|/\rho(\mu, \nu) : \mu \neq \nu\}$,

where this may take value $+\infty$. Let $\mathrm{Lip}_\rho = \{f : L_\rho(f) < \infty\}$. We can restrict $L_\rho$ to
Af(S(A)). In general, few elements of Af(S(A)) will be in $\mathrm{Lip}_\rho$. However, on viewing the
elements of A as elements of Af(S(A)), we have:

3.2 Lemma. Let L be a Lipschitz seminorm on A with corresponding metric $\rho_L$ on S(A).
Then $A \subseteq \mathrm{Lip}_{\rho_L}$, and on A we have $L_{\rho_L} \le L$, in the sense that $L_{\rho_L}(a) \le L(a)$ for all
$a \in A$.

Proof. For $\mu, \nu \in S(A)$ and $a \in A$ we have

$|\hat a(\mu) - \hat a(\nu)| = |\mu(a) - \nu(a)| \le L(a)\rho_L(\mu, \nu)$.

□ 

For later use we remark that if L and M are Lipschitz seminorms on A and if $M \le L$,
then $\rho_M \ge \rho_L$ in the evident sense.

We would like to recover L on A from $\rho_L$ by means of formula (3.1). However, the
seminorms defined by (3.1) have an important continuity property:

3.3 Definition. Let A be a normed vector space, and let L be a seminorm on A, except
that we permit it to take value $+\infty$. Then L is lower semicontinuous if for any sequence
$\{a_n\}$ in A which converges in norm to $a \in A$ we have $L(a) \le \liminf\{L(a_n)\}$. Equivalently,
for one, hence every, $t \in \mathbb{R}$ with t > 0, the set

$\mathcal{L}_t = \{a \in A : L(a) \le t\}$

is norm-closed in A.



3.4 Proposition. Let A be an order-unit space, and let $\rho$ be any metric on S(A), possibly
taking value $+\infty$. Define $L_\rho$ on C(S(A)) by formula (3.1). Then $L_\rho$ is lower
semicontinuous. Consequently, the restriction of $L_\rho$ to any subspace of C(S(A)), such as A or
Af(S(A)), will be lower semicontinuous.

Proof. When we view $L_\rho$ as a function of f, the formula (3.1) says that $L_\rho$ is the pointwise
supremum of a collection of functions (labeled by pairs $\mu, \nu$ with $\mu \neq \nu$) which are
clearly continuous on C(S(A)) for the supremum norm. But the pointwise supremum of
continuous functions is lower semicontinuous. □

3.5 Example. Here is an example of a Lipschitz seminorm L whose metric can be seen
to give S(A) the weak-* topology, but which is not lower semicontinuous. Let I = [-1, 1],
and let $A = C^1(I)$, the algebra of functions which have continuous first derivatives on I.
Define L on A by

$L(f) = \|f'\|_\infty + |f'(0)|$.

For each n let $g_n$ be the function defined by $g_n(t) = n|t|$ for $|t| \le 1/n$, and $g_n(t) = 1$
elsewhere. Let $f_n(t) = \int_{-1}^t g_n(s)\,ds$. Then the sequence $\{f_n\}$ converges uniformly to the
function f given by f(t) = t + 1. But $L(f_n) = 1$ for each n, whereas L(f) = 2.
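
The numbers in this example are easy to confirm. The following Python sketch is ours: it uses the closed form of $f_n$ obtained by integrating $g_n$ piecewise, and it evaluates suprema on a uniform grid of [-1, 1] that contains the endpoints and 0, which is enough here because the relevant suprema are attained at those grid points. The function names and the grid size are assumptions made for the illustration.

```python
def g(n, t):
    """The derivative g_n of f_n: a V-shaped dip to 0 at t = 0, equal to 1 elsewhere."""
    return n * abs(t) if abs(t) <= 1.0 / n else 1.0


def f_n(n, t):
    """Closed form of the integral of g_n from -1 to t."""
    h = 1.0 / n
    if t <= -h:
        return t + 1.0
    if t <= 0.0:
        return 1.0 - h / 2.0 - n * t * t / 2.0
    if t <= h:
        return 1.0 - h / 2.0 + n * t * t / 2.0
    return t + 1.0 - h


def grid(samples=2001):
    """Uniform grid on [-1, 1]; with an odd number of samples it contains 0."""
    return [-1.0 + 2.0 * k / (samples - 1) for k in range(samples)]


def L_of_f_n(n):
    """L(f_n) = sup |g_n| + |g_n(0)|, with the sup taken over the grid."""
    return max(abs(g(n, t)) for t in grid()) + abs(g(n, 0.0))


def L_of_f():
    """L(f) for f(t) = t + 1, whose derivative is identically 1."""
    return 1.0 + 1.0


def sup_distance(n):
    """Grid value of sup |f - f_n|; the exact supremum is 1/n."""
    return max(abs((t + 1.0) - f_n(n, t)) for t in grid())
```

So the uniform distance from $f_n$ to f is 1/n, which tends to 0, while the values of L stay at 1 along the sequence and jump to 2 at the limit.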
A substantial supply of examples of lower semicontinuous seminorms can be obtained
from the W*-derivations of Weaver [W2, W3]. These derivations will in general have large
null spaces, and the seminorms from them need not give the weak-* topology on the state
space. But many of the specific examples of W*-derivations which Weaver considers do in
fact give the weak-* topology. In terms of Weaver's terminology, which we do not review
here, we have:

3.6 Proposition. Let M be a von Neumann algebra and let E be a normal dual operator
M-bimodule. Let $\delta : M \to E$ be a W*-derivation, and denote the domain of $\delta$ by $\mathcal{L}$, so
that $\mathcal{L}$ is an ultra-weakly dense unital *-subalgebra of M. Define a seminorm, L, on $\mathcal{L}$
by $L(a) = \|\delta(a)\|_E$. Then L is lower semicontinuous, and $\mathcal{L}_1 = \{a \in \mathcal{L} : L(a) \le 1\}$ is
norm-closed in M itself.

Proof. Let $\{a_n\}$ be a sequence in $\mathcal{L}$ which converges in norm to $b \in M$. To show that
L is lower semicontinuous, it suffices to consider the case in which $\{a_n\}$ is contained
in $\mathcal{L}_1$. Then the set $\{(a_n, \delta(a_n))\}$ is a bounded subset of the graph of $\delta$ for the norm
$\max\{\|\cdot\|_M, \|\cdot\|_E\}$. Since the graph of a W*-derivation is required to be ultra-weakly
closed, and since bounded ultra-weakly closed subsets are compact for the ultra-weak
topology, there is a subnet which converges ultra-weakly to an element $(c, \delta(c))$ of the graph of
$\delta$. Then necessarily c = b, so that $b \in \mathcal{L}$, and $\delta(b)$ is in the ultra-weak closure of $\{\delta(a_n)\}$.
Consequently $L(b) = \|\delta(b)\| \le 1$. □

Because of the importance of Dirac operators, it is appropriate to verify lower semicon- 
tinuity for the Lipschitz seminorms which they determine. This is close to a special case of 
Proposition 3.6, but does not require any kind of completeness, nor an algebra structure 
on A. 

3.7 Proposition. Let A be a linear subspace of bounded self-adjoint operators on a Hilbert
space $\mathcal{H}$, containing the identity operator. Let D be an essentially self-adjoint operator on
$\mathcal{H}$ whose domain, $\mathcal{D}(D)$, is carried into itself by each element of A. Assume that [D, a] is
a bounded operator on $\mathcal{D}(D)$ for each $a \in A$ (so that [D, a] extends uniquely to a bounded
operator on $\mathcal{H}$). Define L on A by $L(a) = \|[D, a]\|$. Then L is lower semicontinuous.

Proof. Let $\{a_n\}$ be a sequence in A which converges in norm to $a \in A$. Suppose that there
is a constant, k, such that $L(a_n) \le k$ for all n. For any $\xi, \eta \in \mathcal{D}(D)$ with $\|\xi\| = 1 = \|\eta\|$
we have

$\langle [D, a]\xi, \eta\rangle = \langle a\xi, D\eta\rangle - \langle D\xi, a\eta\rangle = \lim \langle [D, a_n]\xi, \eta\rangle$.

But $|\langle [D, a_n]\xi, \eta\rangle| \le k$ for each n, and so $\|[D, a]\| \le k$. □

We remark that the Lipschitz seminorms constructed in [Rf] by means of actions of 
compact groups are easily seen to be lower semicontinuous. 

4. Recovering L from $\rho_L$

In this section we show that a lower semicontinuous Lipschitz seminorm L can be
recovered from its metric $\rho_L$. But before showing this we would like to emphasize the following
point. Let (X, $\rho$) be an ordinary compact metric space, with A the algebra of its Lipschitz
functions, with Lipschitz seminorm L. Then S(A) consists of the probability measures on
X, and the points of X correspond exactly to the extreme points of S(A). The restriction
of $\rho_L$ to the extreme points is exactly $\rho$. Thus when one says that one can recover L from
the metric $\rho$, one is saying that one can recover L from the restriction of $\rho_L$ on S(A) to the
extreme points of S(A). However, for the more general situation which we are considering,
it will be false in general that we can recover L from the restriction of $\rho_L$ to the extreme
points of S(A). Simple explicit examples will be given in Section 7.

One of the main theorems of this paper is: 

4.1 Theorem. Let L be a lower semicontinuous Lipschitz seminorm on an order-unit
space A, and let $\rho_L$ denote the corresponding metric on S(A), possibly taking value $+\infty$.
Let $L_{\rho_L}$ be defined by formula (3.1), but restricted to $A \subseteq Af(S(A))$. Then

$L_{\rho_L} = L$.

Theorem 4.1 is an immediate consequence of the following theorem, since we saw that
lower semicontinuity coincides with $\mathcal{L}_1$ being norm closed.

4.2 Theorem. Let L be any Lipschitz seminorm on an order-unit space A, and let $\rho_L$
denote the corresponding metric on S(A). Let $L_{\rho_L}$ be defined by formula (3.1), but restricted
to $A \subseteq Af(S(A))$. Then $\{a \in A : L_{\rho_L}(a) \le 1\}$ coincides with the norm closure, $\bar{\mathcal{L}}_1$, of
$\mathcal{L}_1$ in A. In particular, $L_{\rho_L}$ is the largest lower semicontinuous seminorm smaller than L,
and $\rho_{L_{\rho_L}} = \rho_L$.

Proof. (An idea leading to this proof, which is simpler than my original proof, was
suggested to me by Nik Weaver.) On A' we define the seminorm, L', dual to L, by

$L'(\lambda) = \sup\{|\lambda(a)| : L(a) \le 1\}$.

Note that L' takes value $+\infty$ on any $\lambda$ for which $\lambda(e) \neq 0$, and very possibly on some
elements of ${A'}^0$ as well. But at any rate we have the following key relationship:

4.3 Lemma. For $\mu, \nu \in S(A)$ we have $\rho_L(\mu, \nu) = L'(\mu - \nu)$.

Proof.

$L'(\mu - \nu) = \sup\{|(\mu - \nu)(a)| : L(a) \le 1\}$

$= \sup\{|\mu(a) - \nu(a)| : L(a) \le 1\} = \rho_L(\mu, \nu)$.

□

Because $\mathcal{L}_1$ is already convex and balanced, the bipolar theorem [Cw] says that $\bar{\mathcal{L}}_1$ is
exactly the bipolar of $\mathcal{L}_1$. Thus we just need to show that $\{a \in A : L_{\rho_L}(a) \le 1\}$ is the
bipolar of $\mathcal{L}_1$. Now it is clear that the unit L'-ball in A' is exactly the polar [Cw] of $\mathcal{L}_1$.
This provides the last of the following equivalences. Let $a \in A$. Then:

$L_{\rho_L}(a) \le 1$ exactly if $|\mu(a) - \nu(a)| \le \rho_L(\mu, \nu)$ for all $\mu, \nu \in S(A)$,
exactly if $|\lambda(a)| \le L'(\lambda)$ for all $\lambda \in D_2$ (by Lemma 4.3 and Lemma 2.1),
exactly if $|\lambda(a)| \le 1$ for all $\lambda \in A'$ with $L'(\lambda) \le 1$,
exactly if a is in the prepolar of $\{\lambda : L'(\lambda) \le 1\}$ (by definition [Cw]),
exactly if a is in the bipolar of $\mathcal{L}_1$.

It is clear that $L_{\rho_L}$ is lower semicontinuous, that it is the largest such seminorm smaller
than L, and that it gives the same metric. □

Note in particular that if L gives S(A) finite diameter, or the weak-* topology, then so
does $L_{\rho_L}$.

We remark that a sort of dual version of Theorem 4.1 can be found later in Theorem 
9.7. 

We have the following related considerations. Suppose again that L is a Lipschitz
seminorm on an order-unit space A. Let $\bar A$ denote the completion of A for $\|\cdot\|$, and let
$\bar{\mathcal{L}}_1$ denote now the closure of $\mathcal{L}_1$ in $\bar A$ rather than just in A. Let $\bar L$ denote the corresponding
"Minkowski functional" on $\bar A$ obtained by setting, for $b \in \bar A$,

$\bar L(b) = \inf\{r \in \mathbb{R}^+ : b \in r\bar{\mathcal{L}}_1\}$.

Since there may be no such r, we must allow the value $+\infty$. With this understanding, $\bar L$
will be a seminorm on $\bar A$. It is easily seen that $\bar L(b) \le 1$ exactly if $b \in \bar{\mathcal{L}}_1$, and that $\bar L$ is
lower semicontinuous because $\bar{\mathcal{L}}_1$ is closed.

Up to this point we did not require lower semicontinuity of L. Its import is given by:

4.4 Proposition. Let L be a lower semicontinuous Lipschitz seminorm on an order-unit
space A. Let $\bar L$ on $\bar A$ be defined as above. Then $\bar L$ is an extension of L, that is, for $a \in A$
we have $\bar L(a) = L(a)$. Furthermore, $\rho_{\bar L} = \rho_L$.

Proof. Suppose that $a \in A$ and L(a) = 1. Then $a \in \mathcal{L}_1 \subseteq \bar{\mathcal{L}}_1$ and so clearly $\bar L(a) \le 1$.
Conversely, if $\bar L(a) \le 1$, then $a \in \bar{\mathcal{L}}_1$. Thus there is a sequence $\{a_n\}$ in $\mathcal{L}_1$ which converges
to a, with $L(a_n) \le 1$ for every n. From the lower semicontinuity of L it follows that
$L(a) \le 1$. Finally, for $\mu, \nu \in S(A)$ we have

$\rho_{\bar L}(\mu, \nu) = \sup\{|\mu(a) - \nu(a)| : a \in \bar{\mathcal{L}}_1\} = \sup\{|\mu(a) - \nu(a)| : a \in \mathcal{L}_1\} = \rho_L(\mu, \nu)$.

□

Note in particular that if L gives S(A) finite diameter, or the weak-* topology, then so
does $\bar L$. However, in general $\bar L$ need not be a Lipschitz seminorm. For example, let A be
the algebra of real polynomials viewed as functions on the interval [0, 2], and let L be the
usual Lipschitz seminorm but defined using only points in [0, 1].

4.5 Definition. We will call $\bar L$ the closure of L. We will say that a Lipschitz seminorm
is closed if $L = \bar L$ (on the subspace where $\bar L$ is finite), or equivalently, if $\mathcal{L}_1$ is complete for
the metric from $\|\cdot\|$.

Then Proposition 4.4 says that for most purposes we can assume that L is closed if 
convenient. 

Suppose now that L is a Lipschitz seminorm on A which is closed. On A we can define
a new norm, $|||\cdot|||$, by

$|||a||| = \|a\| + L(a)$.

It is easily verified that A is complete for this new norm. Suppose that A is a *-algebra and
$\|\cdot\|$ is a C*-norm (this can be weakened). Suppose further that L is a closed Lipschitz
seminorm on A which satisfies the Leibniz inequality. Then the new norm is a
normed-algebra norm, and so A becomes a Banach algebra for the new norm. In Sections 10 and 11
we will indicate many examples of Lipschitz seminorms satisfying the Leibniz inequality.
This provides a rich class of examples of Banach algebras which merit study (even in the
cases when they are commutative) along the lines considered in [BCD, J, W1].

5. The pre-dual of (A, L) 

It has been shown in an increasing variety of situations that the space of Lipschitz
functions with a suitable Lipschitz norm is isometrically isomorphic to the dual of some
Banach space. Some of the history of this phenomenon is sketched in the notes at the
end of chapter 2 of [W1], or more briefly in [W2]. Within the non-commutative setting,
Weaver shows in Proposition 2 of [W2] that the domains of W*-derivations (as defined
there) are dual spaces. However, his W*-derivations can have large null spaces, and they
need not give the weak-* topology on S(A). Nevertheless, Weaver's approach applies to
the non-commutative tori, and gives them the same space of Lipschitz elements as the
approach of the present paper (when combined with [Rf]). In fact, Weaver shows in [W3]
that for the non-commutative tori one can also define $\mathrm{lip}_\alpha$, and that $\mathrm{Lip}_\alpha$ is actually the
second dual of $\mathrm{lip}_\alpha$ when $\alpha < 1$.

To show within our setting that the space of Lipschitz elements is the dual of a Banach
space, we need to assume that $\rho_L$ gives the weak-* topology on S(A). As before, let
$\mathcal{L}_1 = \{a : L(a) \le 1\}$. From theorem 1.8 of [Rf] we know that $\rho_L$ will give the weak-*
topology on S(A) exactly if the image of $\mathcal{L}_1$ in $\tilde A$ is totally bounded for $\|\cdot\|^\sim$. Equivalently,
by theorem 1.9 of [Rf], L must give S(A) finite radius, and for one, hence all, $t \in \mathbb{R}$ with
t > 0, the set

$\mathcal{B}_t = \{a : L(a) \le 1 \text{ and } \|a\| \le t\}$

must be totally bounded in A for $\|\cdot\|$. We remark that this implies that if $\{a_n\}$ is a
sequence (or net) in A converging pointwise on S(A) to $a \in A$, and if there is a constant
k such that $\|a_n\| \le k$ and $L(a_n) \le k$ for all n, then $a_n$ converges to a in norm. This is
because $\{a_n\}$ is contained in $k\mathcal{B}_1$ whose closure in the completion $\bar A$ of A is compact. Let
b be any norm limit point of $\{a_n\}$ in $\bar A$. Then a subsequence of $\{a_n\}$ converges in norm
to b. But it still converges pointwise on S(A) to a. Consequently b = a, and a is the only
norm limit point of $\{a_n\}$.

We now have in view all the requirements on Lipschitz seminorms which we need for our 
present purposes. So we now define what we expect is the correct way to specify metrics 
on compact non-commutative spaces: 

5.1 Definition. Let A be an order-unit space. By a Lip-norm on A we mean a seminorm,
L, on A (taking finite values) with the following properties:

1) For $a \in A$ we have L(a) = 0 if and only if $a \in \mathbb{R}e$.

2) L is lower semicontinuous.

3) $\{a \in A : L(a) \le 1\}$ has image in $\tilde A$ which is totally bounded for $\|\cdot\|^\sim$.

We remark that it is easily checked that the closure (Definition 4.5) of a Lip-norm is
again a Lip-norm.

Within the present setting the fact that the space of Lipschitz elements is a dual Banach
space takes the following form (which requires the Lip-norm to be closed).

5.2 Theorem. Let A be an order-unit space, and let L be a Lip-norm on A which is
closed. Let $\mathcal{K} = \{a \in \tilde A : \tilde L(a) \le 1\}$, so that $\mathcal{K}$ is a compact (convex) set for $\|\cdot\|^\sim$.
Then $(\tilde A, \tilde L)$ is naturally isometrically isomorphic to the dual Banach space of $Af_0(\mathcal{K})$, the
Banach space of continuous affine functions on $\mathcal{K}$ which take value 0 at $0 \in \tilde A$, with the
supremum norm.

Proof. Let $\mathcal{L}_1$ and $\mathcal{B}_t$ be as defined above. Because L is closed, the totally bounded
sets $\mathcal{B}_t$ are complete for $\|\cdot\|$, and so are compact. From the finite radius considerations
of Section 2 the image of $\mathcal{L}_1$ in $\tilde A$ will coincide with the image of $\mathcal{B}_t$ for sufficiently large
t. Hence the image of $\mathcal{L}_1$ in $\tilde A$ is compact for $\|\cdot\|^\sim$, not just totally bounded. But the
image of $\mathcal{L}_1$ is exactly $\mathcal{K}$, as defined in the statement of the theorem.

We can now argue as in the proof of proposition 1 of [W4]. We include the argument
here in a form specific to our particular situation.

Let $V = Af_0(\mathcal{K})$, as defined in the statement of the theorem. Then from lemma 4.1
of [K3] each element of V extends to a linear functional (not necessarily continuous for
$\|\cdot\|^\sim$) on $\tilde A$. But we still view V as equipped with the uniform norm $\|\cdot\|_\infty$ from $C(\mathcal{K})$,
for which V is complete. Then for any $f \in V$ we have

$\|f\|_\infty = \sup\{f(a) : a \in \mathcal{K}\} = \sup\{f(a) : \tilde L(a) \le 1\}$.

Consequently $\|\cdot\|_\infty$ is just the dual norm to the norm $\tilde L$ on $\tilde A$. But V will usually be
much smaller than the entire dual Banach space of $(\tilde A, \tilde L)$ because of the requirement that
if $f \in V$ then f is continuous on $\mathcal{K}$.

We let V' denote the dual Banach space to V. We have the evident mapping $\sigma$ from
$\tilde A$ to V' defined by $\sigma(a)(f) = f(a)$. Use of the Hahn-Banach theorem shows that $Af_0(\mathcal{K})$
separates the points of $\mathcal{K}$, and from this we see that $\sigma$ is injective. Furthermore
$|\sigma(a)(f)| = |f(a)| \le \|f\|_\infty \tilde L(a)$, and so $\|\sigma\| \le 1$ for the norm $\tilde L$ on $\tilde A$. In particular, $\sigma(\mathcal{K}) \subseteq (V')_1$, the
unit ball of V'. From the definitions of $\sigma$ and V we see immediately that $\sigma$ is continuous
from $\mathcal{K}$ to $(V')_1$ with its weak-* topology from V. Since $\mathcal{K}$ is compact, $\sigma(\mathcal{K})$ must be
compact for the weak-* topology. If $\sigma(\mathcal{K})$ were not all of $(V')_1$, there would be a $\varphi_0 \in (V')_1$
and a weak-* continuous linear functional separating $\varphi_0$ from $\sigma(\mathcal{K})$. But every weak-*
continuous linear functional comes from V. Thus there would be an $f \in V$ such that

$f(a) \le 1 < \varphi_0(f)$

for every $a \in \mathcal{K}$. But the first inequality means that $\|f\|_\infty \le 1$, and so the second inequality
means that $\|\varphi_0\| > 1$, contradicting the assumption that $\varphi_0 \in (V')_1$. Thus $\sigma(\mathcal{K}) = (V')_1$.
Consequently $\sigma$ is an isometric isomorphism of $(\tilde A, \tilde L)$ with V'. □

We remark that, if desired, we can make A itself into the dual of a Banach space, in a
non-canonical way, as follows. Let r be the radius of (A, L), and let $\mu$ be any fixed state
of A. Define an actual norm, $L_\mu$, on A by

$L_\mu(a) = \max\{|\mu(a)|/r, L(a)\}$.

Let $\tilde L_\mu$ be the quotient of $L_\mu$ on $\tilde A$. It is clear that $\tilde L_\mu \ge \tilde L$. But for any given $a \in A$ we
can find $\alpha \in \mathbb{R}$ such that $\|a - \alpha\| \le rL(a)$, by the definition of radius. Then

$|\mu(a - \alpha)| \le \|a - \alpha\| \le rL(a)$,

while $L(a - \alpha) = L(a)$. Consequently $\tilde L_\mu(\tilde a) \le \tilde L(\tilde a)$, so that, in fact, $\tilde L_\mu = \tilde L$. Thus $(A, L_\mu)$
has $(\tilde A, \tilde L)$ as quotient space. The quotient map splits by the isometric map $\tilde a \mapsto a - \mu(a)$.
Since $(\tilde A, \tilde L)$ is isometrically isomorphic to a dual Banach space, it follows easily that
$(A, L_\mu)$ is also.

See also section 2 of [H], which gives a slightly different approach because the norm on
$\mathrm{Lip}_\rho$ is slightly different from that implicit here.

Let $K$ and $V = Af_0(K)$ be as in the statement of Theorem 5.2. As in Section 2, the dual
of $(\tilde A, \|\cdot\|^{\sim})$ is $A'^0$. By the finite diameter condition and Proposition 2.2
each $\lambda \in A'^0$ defines a continuous linear functional on $(\tilde A, L)$. Each such
functional is clearly continuous on $K$, for its topology from $\|\cdot\|^{\sim}$. Thus each
$\lambda \in A'^0$ defines an element of $V$, and so we obtain a linear map from $A'^0$ into $V$.
From Theorem 5.2 the norm $\|\cdot\|_\infty$ on $V$ from $C(K)$ coincides with the dual norm $L'$
from $(\tilde A, L)$. We have the following addition to Theorem 5.2.

5.3 Proposition. The image of $A'^0$ in $Af_0(K)$ is dense in $Af_0(K)$ for its norm
$\|\cdot\|_\infty = L'$.

Proof. Let $\varphi$ be any continuous linear functional on $V$ which is 0 on the image of $A'^0$.
From Theorem 5.2 every continuous linear functional on $V$ comes from an element of $\tilde A$.
If $\tilde a$ is the element of $\tilde A$ corresponding to $\varphi$, we then have
$\lambda(\tilde a) = 0$ for all $\lambda \in A'^0$, which implies that $\tilde a = 0$, so that
$\varphi = 0$. It follows from the Hahn-Banach theorem that the image of $A'^0$ is norm dense
in $V$. □



6. Extreme points 

Let $L$ be a Lipschitz seminorm on an order-unit space $A$, and let $\rho_L$ be the corresponding
metric on $S(A)$. Let $E$ denote the set of extreme points of $S(A)$. Then $E$ need not be
a closed subset of $S(A)$, but $S(A)$ is the closed convex hull of $E$ by the Krein-Milman
theorem. Of course $\rho_L$ restricts to a metric on $E$. We will give explicit examples in the
next section to show that even when $L$ is a Lip-norm the restriction of $\rho_L$ to $E$ does not
determine $\rho_L$ or $L$. Nevertheless, we can try to use the restriction of $\rho_L$ to define a
new Lipschitz seminorm, $L^e$, on $A$, by

$$L^e(a) = \sup\{|\varepsilon(a) - \eta(a)|/\rho_L(\varepsilon, \eta) : \varepsilon, \eta \in E,\ \varepsilon \ne \eta\}.$$

6.1 Proposition. With the above definition, $L^e$ is a lower semicontinuous Lipschitz seminorm
on $A$, and it is the smallest such on $A$ whose metric on $S(A)$ agrees on $E$ with that
of $L$. If $L$ is a Lip-norm then so is $L^e$.

Proof. From Theorem 4.2 it is clear that we can assume that $L$ is lower semicontinuous.
From Theorem 4.1 we know that any lower semicontinuous Lipschitz seminorm, say $L_1$, is
recovered from its metric by a supremum as above, but ranging over all of $S(A)$ rather than
just over $E$. Thus if the metric for $L_1$ agrees with $\rho_L$ on $E$, we must have
$L^e \le L_1$. By using the argument in the proof of Proposition 3.4 it is easily seen that $L^e$
is lower semicontinuous. Suppose that $L^e(a) = 0$ for some $a \in A$. Recall that
$D_2 = \{\lambda \in A'^0 : \|\lambda\| \le 2\}$.

6.2 Lemma. The convex hull of $\{\varepsilon - \eta : \varepsilon, \eta \in E,\ \varepsilon \ne \eta\}$
is dense in $D_2$ for the weak-* topology.

Proof. From Lemma 2.1 we know that any element of $D_2$ can be expressed as $\mu - \nu$ for
$\mu, \nu \in S(A)$. By the Krein-Milman theorem each of $\mu, \nu$ can be approximated
arbitrarily closely in the weak-* topology by convex combinations from $E$, say
$\sum_j \alpha_j \varepsilon_j$ and $\sum_k \beta_k \eta_k$. But the difference of such combinations
can be expressed as

$$\sum_{j,k} (\alpha_j \beta_k)(\varepsilon_j - \eta_k).$$

□

From this lemma it is clear that if $L^e(a) = 0$ then $L(a) = 0$, and thus $a \in \mathbb{R}e$.
Also, it is easy to see that $\rho_{L^e}$ agrees with $\rho_L$ on $E$.

Finally, we must show that if $L$ is a Lip-norm then the image of $K_0 = \{a : L^e(a) \le 1\}$
in $\tilde A$ is totally bounded for $\|\cdot\|^{\sim}$. Notice that this image is larger than that
for $L$, so we cannot immediately apply the corresponding fact for $L$. Let $\bar E$ denote the
closure of $E$ in $S(A)$. It is clear that the supremum defining $L^e$ could just as well be taken
over $\bar E$, and so $L^e$ on $A$ is just the Lipschitz norm for the metric $\rho_L$ restricted to
$\bar E$. Thus $K_0$ can be viewed as contained in $\{f \in C(\bar E) : L^e(f) \le 1\}$, and the
latter has totally bounded image in $C(\bar E)/\mathbb{R}e$ since it consists of Lipschitz
functions for a metric and $\bar E$ is compact. Thus $K_0$ has totally bounded image in
$C(\bar E)/\mathbb{R}e$. But the restriction map from $Af(S(A))$ to $C(\bar E)$ is isometric for
$\|\cdot\|_\infty$ since $\bar E$ contains the extreme points. (See Theorem II.1.8 of
[Al]. We are dealing here with Kadison's smallest separating representation.) It follows
easily that $K_0$ has totally bounded image in $\tilde A$ as needed. □

We remark that if $F$ is any subset of $S(A)$ which contains $E$, then we can use $F$ instead
of $E$ to define a Lip-norm $L^F$ just as we defined $L^e$ above. Then we will have

$$L^e \le L^F \le L$$

in the evident sense, with reverse inequalities for the corresponding metrics.

Suppose that $A$ is a dense *-subalgebra of a C*-algebra, $\bar A$, and that $L$ is a Lip-norm
on $A$, with corresponding metric $\rho_L$ on $S(A)$. As above let $E$ denote the set of extreme
points of $S(A)$. Assume first that $A$ is commutative. Then $E$ is compact and $\bar A = C(E)$.
Assume that $L = L^e$. Then $L$ is the usual Lipschitz norm coming from the metric on the
compact set $E$ obtained by restricting $\rho_L$ to $E$. But in this case we know that $L$ must
then satisfy the Leibniz rule

$$L(ab) \le L(a)\|b\| + \|a\|L(b).$$

It is thus natural to ask the general question:

6.3 Question. What conditions on a Lip-norm $L$ on a general unital C*-algebra imply
that $L$ satisfies the Leibniz rule?

In the next section we will see examples of Lip-norms which do not satisfy $L = L^e$ and
yet satisfy the Leibniz rule.

7. Dirac operators and ordinary finite spaces 

Connes has shown [C1, C2, C3] that for a compact Riemannian (spin) manifold all the
metric information is contained in the Dirac operator. This led him to suggest that for
"non-commutative spaces", metrics should be specified by some analogue of Dirac operators.
We explore here some aspects of this suggestion for finite-dimensional commutative
C*-algebras, i.e. ordinary finite spaces. This will clarify some of the considerations of the
previous sections. Here and throughout all the rest of this paper, when we say that an
operator $D$ is a "Dirac" operator, this is not meant to indicate any particular properties
of $D$, but rather is meant to indicate how $D$ is employed, namely to define a Lipschitz
seminorm.

Let $X$ be a finite set, and let $A = C(X)$. In order to remain fully in the setting of the
previous sections we take $C(X)$ to consist only of real-valued functions. But in the present
commutative situation this is not so important because, unlike the non-commutative case,
if one does not know the algebra structure, the norm for complex-valued functions is still
given by a simple formula in terms of the norm for real-valued functions. (See e.g. lemma 14
of [W2].) Consequently we will be a bit careless here about this distinction.

We will suppose that $A$ has been faithfully represented on a finite-dimensional complex
Hilbert space $\mathcal{H}$. We suppose given on $\mathcal{H}$ an operator $D$ (the "Dirac"
operator). It is usual to take $D$ to be self-adjoint. But we find it slightly more convenient to
take $D$ to be skew-adjoint. The two choices are related by a multiplication by $i$, and give the
same metric results. Following Connes, we define a seminorm, $L$, on $A$ by

$$L(a) = \|[D, a]\|,$$

where $[\ ,\ ]$ denotes the usual commutator of operators, and the norm is the operator norm.
We want $L$ to be a Lip-norm. Thus we require that if $[D, a] = 0$ then $a \in \mathbb{C}I$.
Because we are in a finite-dimensional setting, $L$ is continuous for $\|\cdot\|_\infty$, and
indeed is a Lip-norm on $A$.

From $L$ we obtain a metric, $\rho_L$, on the space $S(A)$ of probability measures on $X$, as
well as on its set of extreme points, which is identified with $X$ itself. We now give a very
simple example to show that $\rho_L$ on $S(A)$ need not agree with the metric obtained from
$\rho_L$ on $X$.

7.1 Example. Consider a three-dimensional commutative C*-algebra, $A$, represented
faithfully on a three-dimensional Hilbert space. Thus we can identify $A$ with the algebra
of diagonal matrices in the full matrix algebra $M_3 = M_3(\mathbb{C})$. We will consider Dirac
operators of a special form which facilitates calculation, namely matrices $D$ in
$M_3(\mathbb{C})$ of the form

$$D = \begin{pmatrix} 0 & 0 & \alpha \\ 0 & 0 & \beta \\ -\alpha & -\beta & 0 \end{pmatrix}$$

where $\alpha > 0$ and $\beta > 0$. We will also restrict to those $f \in A$ which are real, and
denote the three values (or diagonal entries) of $f$ by $(f_1, f_2, f_3)$. Because $D$ is
skew-symmetric, $[D, f]$ is a real symmetric matrix, whose eigenvalues thus are real. In fact,
we have

$$[D, f] = \begin{pmatrix} 0 & 0 & \alpha(f_3 - f_1) \\ 0 & 0 & \beta(f_3 - f_2) \\ \alpha(f_3 - f_1) & \beta(f_3 - f_2) & 0 \end{pmatrix}.$$

Because of this special form, the eigenvalues are easily calculated, and one finds that

$$L(f) = \|[D, f]\| = \left(\alpha^2(f_3 - f_1)^2 + \beta^2(f_3 - f_2)^2\right)^{1/2}.$$

It is clear from this that if $L(f) = 0$ then $f$ is a constant function. Thus $L$ defines a
Lip-norm on $A$.
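The closed form for $L(f)$ in this example can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names (`dirac_operator`, `commutator_with_diagonal`, `operator_norm`, `lip_norm_closed_form`) are hypothetical, and the operator norm is estimated by power iteration on $M^T M$, which is simply a convenient choice here.

```python
import math

def dirac_operator(alpha, beta):
    """The skew-symmetric 3x3 matrix D of Example 7.1 (hypothetical helper)."""
    return [[0.0, 0.0, alpha],
            [0.0, 0.0, beta],
            [-alpha, -beta, 0.0]]

def commutator_with_diagonal(D, f):
    """[D, f] = D diag(f) - diag(f) D; its (i, j) entry is D[i][j] * (f[j] - f[i])."""
    n = len(f)
    return [[D[i][j] * (f[j] - f[i]) for j in range(n)] for i in range(n)]

def operator_norm(M, iterations=200):
    """Estimate the operator norm of a real square matrix by power iteration on M^T M."""
    n = len(M)
    v = [1.0 + 0.1 * i for i in range(n)]  # generic starting vector
    length = math.sqrt(sum(x * x for x in v))
    v = [x / length for x in v]
    rayleigh = 0.0
    for _ in range(iterations):
        Mv = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        w = [sum(M[i][j] * Mv[i] for i in range(n)) for j in range(n)]  # M^T (M v)
        rayleigh = sum(v[j] * w[j] for j in range(n))  # equals |M v|^2 for unit v
        length = math.sqrt(sum(x * x for x in w))
        if length == 0.0:
            return 0.0
        v = [x / length for x in w]
    return math.sqrt(rayleigh)

def lip_norm_closed_form(alpha, beta, f):
    """L(f) = (alpha^2 (f3 - f1)^2 + beta^2 (f3 - f2)^2)^(1/2)."""
    return math.sqrt(alpha ** 2 * (f[2] - f[0]) ** 2 + beta ** 2 * (f[2] - f[1]) ** 2)

if __name__ == "__main__":
    alpha, beta, f = 2.0, 3.0, [1.0, -0.5, 2.0]
    numeric = operator_norm(commutator_with_diagonal(dirac_operator(alpha, beta), f))
    print(numeric, lip_norm_closed_form(alpha, beta, f))
```

The two printed numbers should agree up to rounding; a constant $f$ gives the zero commutator and hence $L(f) = 0$.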

We now proceed to calculate the corresponding metric on $S(A)$. We first calculate the
dual norm, $L'$, on $A'^0$, the dual space of $\tilde A$, with notation as in the previous
sections. We identify $A'^0$ with real diagonal matrices of trace 0, paired with $A$ via the
trace. For $\lambda \in A'^0$ we denote its components by
$\lambda = (\lambda_1, \lambda_2, \lambda_3)$. Of course

$$L'(\lambda) = \sup\{|\langle f, \lambda \rangle| : L(f) \le 1\}.$$

Now both $|\langle f, \lambda \rangle|$ and $L(f)$ are unchanged if we add a constant function to
$f$. Thus for the supremum defining $L'(\lambda)$ we can assume that $f_3 = 0$ always.
Furthermore, we know that $\lambda_3 = -(\lambda_1 + \lambda_2)$. Thus we need only deal with the
first two components of $f$ and $\lambda$. We do this without changing notation. Then we see that

$$L'(\lambda) = \sup\{|f_1\lambda_1 + f_2\lambda_2| : \alpha^2 f_1^2 + \beta^2 f_2^2 \le 1\}.$$

But this is just the norm of a functional on a suitable Hilbert space. Specifically, let
$\ell^2(w)$ be the Hilbert space of functions on a 2-point space with weight function $w$ given by
$(\alpha^2, \beta^2)$. Then

$$f_1\lambda_1 + f_2\lambda_2 = f_1(\lambda_1/\alpha^2)\alpha^2 + f_2(\lambda_2/\beta^2)\beta^2,$$

and in this form the norm of the functional is the length of the vector in $\ell^2(w)$ defining it.
This gives

$$L'(\lambda) = \left((\lambda_1/\alpha^2)^2\alpha^2 + (\lambda_2/\beta^2)^2\beta^2\right)^{1/2} = \left(\lambda_1^2/\alpha^2 + \lambda_2^2/\beta^2\right)^{1/2}.$$

We now apply this to obtain the metric on $S(A)$. If $\mu, \nu \in S(A)$, then for the evident
notation

$$\rho_L(\mu, \nu) = L'(\mu - \nu) = \left((\mu_1 - \nu_1)^2/\alpha^2 + (\mu_2 - \nu_2)^2/\beta^2\right)^{1/2}.$$

Let $X$ denote the maximal ideal space of $A$. We identify its 3 points with the 3 extreme
points of $S(A)$, and label them, corresponding to the coordinates in $A$, by
$\delta_1, \delta_2, \delta_3$. Then from the above formula for $\rho_L$ we find that the metric on
$X$ is given by:

$$\rho_L(\delta_1, \delta_2) = (1/\alpha^2 + 1/\beta^2)^{1/2}, \qquad \rho_L(\delta_1, \delta_3) = 1/\alpha, \qquad \rho_L(\delta_2, \delta_3) = 1/\beta.$$

Define $\gamma$ by $\rho_L(\delta_1, \delta_2) = 1/\gamma$. Let $L^e$ denote the ordinary Lipschitz
norm on $A$ coming from this metric on $X$. Then

$$L^e(f) = \max\{|f_1 - f_2|\gamma,\ |f_1 - f_3|\alpha,\ |f_2 - f_3|\beta\}.$$

Clearly $L^e$ is quite different from $L$. From Theorem 4.1 we know that the metrics on $S(A)$
will thus be quite different, even though they agree on the extreme points. This is, of
course, also easily seen by direct calculations.
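One such direct calculation can be sketched in a few lines. The code below is illustrative only: the function names and the choice $\alpha = \beta = 1$ are assumptions made here, and the formulas for $L$, $L'$ and $\rho_L$ are the closed forms of Example 7.1. It exhibits an $f$ with $L^e(f) = 1 < L(f)$, and uses that $f$ to bound $\rho_{L^e}$ from below at a pair of states where $\rho_L$ is strictly smaller.

```python
import math

DELTAS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # extreme points of S(A)

def lip_norm(alpha, beta, f):
    """L(f) from the Dirac operator of Example 7.1 (closed form)."""
    return math.sqrt(alpha ** 2 * (f[2] - f[0]) ** 2 + beta ** 2 * (f[2] - f[1]) ** 2)

def rho(alpha, beta, mu, nu):
    """rho_L(mu, nu) = L'(mu - nu), using only the first two coordinates."""
    l1, l2 = mu[0] - nu[0], mu[1] - nu[1]
    return math.sqrt(l1 ** 2 / alpha ** 2 + l2 ** 2 / beta ** 2)

def extreme_point_lip_norm(alpha, beta, f):
    """L^e(f): the ordinary Lipschitz constant of f for rho_L restricted to X."""
    return max(abs(f[i] - f[j]) / rho(alpha, beta, DELTAS[i], DELTAS[j])
               for i in range(3) for j in range(i + 1, 3))

def pairing(f, mu):
    """The value mu(f) of the state mu on the function f."""
    return sum(x * m for x, m in zip(f, mu))

if __name__ == "__main__":
    alpha = beta = 1.0
    f = [1.0, 1.0, 0.0]
    mu, nu = [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]
    print("L(f)   =", lip_norm(alpha, beta, f))
    print("L^e(f) =", extreme_point_lip_norm(alpha, beta, f))
    print("rho_L(mu, nu) =", rho(alpha, beta, mu, nu))
    # Since L^e(f) <= 1, |mu(f) - nu(f)| is a lower bound for the metric of L^e.
    print("lower bound for rho_{L^e}(mu, nu) =", abs(pairing(f, mu) - pairing(f, nu)))
```

Because $\rho_{L^e}(\mu, \nu)$ is a supremum over all $f$ with $L^e(f) \le 1$, a single such $f$ already shows that it exceeds $\rho_L(\mu, \nu)$ at this pair of non-extreme states.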

We now make some observations in preparation for the next section. It is well-known
[W1, W2] that the Lipschitz seminorms $L = L_\rho$ from ordinary metrics on a metric space
$X$ have a nice relation to the lattice structure of (real-valued) $C(X)$, namely

$$L(f \vee g) \le L(f) \vee L(g).$$

We remark that for the $L$ of the above example this inequality fails. For instance, with
notation as above, let $f = (1, 0, 0)$ and $g = (0, 1, 0)$, so that $f \vee g = (1, 1, 0)$. Then we
see that

$$L(f) = \alpha, \qquad L(g) = \beta, \qquad \text{while} \qquad L(f \vee g) = (\alpha^2 + \beta^2)^{1/2}.$$

(This is related to the counterexample following theorem 16 of [W2].)
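The failure of the lattice inequality is easy to reproduce. A minimal sketch, assuming the closed form for $L$ from Example 7.1; the values $\alpha = 2$, $\beta = 3$ and the function names are choices made here for illustration:

```python
import math

def lip_norm(alpha, beta, f):
    """L(f) = (alpha^2 (f3 - f1)^2 + beta^2 (f3 - f2)^2)^(1/2), as in Example 7.1."""
    return math.sqrt(alpha ** 2 * (f[2] - f[0]) ** 2 + beta ** 2 * (f[2] - f[1]) ** 2)

def join(f, g):
    """Pointwise maximum f v g."""
    return [max(a, b) for a, b in zip(f, g)]

if __name__ == "__main__":
    alpha, beta = 2.0, 3.0
    f, g = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
    print("L(f), L(g) =", lip_norm(alpha, beta, f), lip_norm(alpha, beta, g))
    print("L(f v g)   =", lip_norm(alpha, beta, join(f, g)))
```

Here $L(f \vee g) = \sqrt{13}$ exceeds $L(f) \vee L(g) = 3$.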

However, it is not difficult to check that the above $L$ does satisfy the weaker inequality

$$L(f \vee 0) \le L(f).$$

In fact, one can prove that this holds for any choice of skew-adjoint $D$ for the above $A$.
To find a counterexample for this weaker inequality one must take $A$ to be 4-dimensional.
I have not found a systematic way of constructing a counterexample there, but some
examination of what is needed, followed by some experimentation with MATLAB, yields
the following (and related) example:

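[Display of the $4 \times 4$ matrix $D$: garbled in this copy; only the entries $-2$ and $-4$ are legible.]

and $f = (4, 2, 0, -1)$.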

We remark that ordinary Lipschitz norms on compact metric spaces can all be easily 
obtained by means of Dirac operators. I pointed this out in a lecture in 1993, and the 
details are indicated after the proof of proposition 8 of [W2] . See also the discussion for 
graphs which we will give toward the end of Section 11. 

8. A characterization of ordinary Lipschitz seminorms 

Let $X$ be a compact space, let $\rho$ be a metric on $X$ (giving the topology of $X$), and let
$L$ denote the corresponding ordinary Lip-norm on $C(X)$ (permitted to take value $+\infty$).
As just mentioned in the last section, it is well-known [W1, W2] and easy to prove that $L$
relates nicely to the lattice structure of $C(X)$ by means of the inequality

$$L(f \vee g) \le L(f) \vee L(g).$$

In Weaver's more general setting of domains of W*-derivations he proves this inequality
for W*-derivations of Abelian structure. (See lemma 12 of [W2].) We show here that
the above inequality exactly characterizes the Lip-norms which are the ordinary Lipschitz
seminorms coming from ordinary metrics on $X$.

We remark that we never assume here that our Lip-norms satisfy the Leibniz inequality
for the algebra structure, namely

$$L(fg) \le L(f)\|g\| + \|f\|L(g).$$

But ordinary Lipschitz seminorms do satisfy this inequality. Thus one consequence of this
section is that the above lattice inequality implies the Leibniz inequality. On the other
hand, the Lip-norm from any "Dirac" operator will satisfy the Leibniz inequality, but can
easily fail to satisfy the lattice inequality, as we saw by examples in the previous section.
Thus the lattice inequality is much stronger than the Leibniz inequality.

However we should point out that for Dirac operators on compact spin Riemannian manifolds,
in spite of their being defined by means of various partial derivatives and spinors, the
corresponding Lip-norms do satisfy the lattice inequality. This is because Connes shows
[C1, C2, C3] that the Lip-norms which those Dirac operators define coincide with the ordinary
Lip-norms for the ordinary metrics on the manifolds determined by the Riemannian
metrics.

Recall that for us C(X) consists of real-valued functions. 

8.1 Theorem. Let $X$ be a compact space, let $A$ be a dense subspace of $C(X)$ containing
the constant functions, and let $L$ be a Lip-norm on $A$. Let $\bar L$ denote the closure of $L$,
viewed as defined on all of $C(X)$ as in the discussion before Proposition 4.4, and thus
permitted to take value $+\infty$. Then the following conditions are equivalent:

1. The Lip-norm $L$ is the restriction to $A$ of the usual Lipschitz seminorm corresponding
to a metric on $X$ (namely the metric $\rho_L$).

2. For every $f, g \in C(X)$ we have

$$\bar L(f \vee g) \le \bar L(f) \vee \bar L(g).$$

The following lemma is somewhat parallel to lemma 13 of [W2]. For later use we state 
it in slightly greater generality than needed immediately. 

8.2 Lemma. Let $A$ be a dense subspace of $C(X)$ containing the constant functions, and
closed under the finite lattice operations (i.e. if $f, g \in A$ then $f \vee g \in A$). Let $L$
be a Lip-norm on $A$ which satisfies the inequality

$$L(f \vee g) \le L(f) \vee L(g)$$

for all $f, g \in A$. Let $\bar L$ be the closure of $L$, defined on all of $C(X)$, permitted to
take value $+\infty$. Let $\mathcal{F}$ be a bounded subset of $A$ for which there is a constant,
$k$, such that $L(f) \le k$ for all $f \in \mathcal{F}$. Let $g = \sup\{f \in \mathcal{F}\}$. Then
$g \in C(X)$ and $\bar L(g) \le k$.

Proof. Let $\{g_\alpha\}$ be the net of suprema of finite subsets of $\mathcal{F}$. Then
$\{g_\alpha\}$ is contained in $A$, and converges up to $g$ pointwise. By the hypothesis on $L$ we
have $L(g_\alpha) \le k$ for all $\alpha$. Thus we have

$$|g_\alpha(x) - g_\alpha(y)| \le k\rho_L(x, y)$$

for all $\alpha$ and all $x, y \in X$; that is, $\{g_\alpha\}$ is equicontinuous. We can thus apply
the Ascoli theorem [Ru] to conclude that the net $\{g_\alpha\}$ has a subnet which converges
uniformly. But the limit of this subnet must be $g$, and so $g$ must be continuous. Furthermore,
from the lower semicontinuity of $\bar L$ we must have $\bar L(g) \le k$. □

Proof of Theorem 8.1. As indicated above, it is basically well-known, and not hard to
verify, that condition 1 implies condition 2. Suppose conversely that condition 2 holds.
For any $x \in X$ let $\rho_L^x$ be the continuous function on $X$ defined by
$\rho_L^x(y) = \rho_L(x, y)$. Set $S_x = \{f \in A : f(x) = 0,\ L(f) \le 1\}$. Since $L(f)$ is
unchanged when a constant function is added to $f$, or when $f$ is replaced by $-f$, the
definition of $\rho_L$ can be rewritten as

$$\rho_L^x(y) = \sup\{f(y) : f \in S_x\}.$$

This means that $\rho_L^x = \sup S_x$. But $S_x$ is a bounded set in $A$ by the finite radius
considerations. Thus we can apply the above lemma to conclude that $\bar L(\rho_L^x) \le 1$.
Suppose that $\bar L(\rho_L^x) = c < 1$. Then $\bar L((1/c)\rho_L^x) = 1$, and so from the
definition of $\rho_L$ we obtain

$$(1/c)|\rho_L^x(x) - \rho_L^x(y)| \le \rho_L(x, y),$$

for all $y \in X$, that is,

$$\rho_L(x, y) \le c\rho_L(x, y)$$

for all $y \in X$, which is impossible (unless $X$ has only one point, which we now do not
permit). Thus $\bar L(\rho_L^x) = 1$.

Much as in Section 6, let $L^e$ denote the ordinary Lip-norm on $C(X)$ (permitting value
$+\infty$) corresponding to the restriction of $\rho_L$ as metric on $X$. (Recall that $X$ is
identified with the extreme points of $S(A)$.) As seen in Proposition 6.1, $L^e \le \bar L$. We
now show that $L^e = \bar L$ because of the inequality in the hypotheses of our theorem (and its
extension in Lemma 8.2). Let $f \in C(X)$, and suppose that $L^e(f) \le 1$. Thus

$$|f(x) - f(y)| \le \rho_L(x, y)$$

for all $x, y \in X$. In particular

$$f(x) - \rho_L(x, y) \le f(y).$$

For each $x \in X$ define $h_x \in C(X)$ by

$$h_x(y) = f(x) - \rho_L(x, y).$$

Then the above inequality says that $h_x \le f$ for each $x$. But it is clear that
$h_x(x) = f(x)$. Thus $f = \sup\{h_x : x \in X\}$. Then from the considerations of the previous
paragraph we see that $\bar L(h_x) = 1$ for all $x$. Thus by Lemma 8.2 we have
$\bar L(f) \le 1$. It follows that $\bar L = L^e$ as desired. □
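The key step of this proof, writing a function of Lipschitz constant at most 1 as the supremum of the "cones" $h_x$, can be illustrated on a finite metric space. The sketch below is an illustration under assumptions made here: a four-point subset of the real line with the usual distance, and hypothetical helper names. It also checks one instance of the lattice inequality for the ordinary Lipschitz seminorm.

```python
def lipschitz_seminorm(rho, f):
    """Ordinary Lipschitz seminorm of f for the metric rho on a finite set."""
    n = len(f)
    return max(abs(f[i] - f[j]) / rho[i][j]
               for i in range(n) for j in range(n) if i != j)

def join(f, g):
    """Pointwise maximum f v g."""
    return [max(a, b) for a, b in zip(f, g)]

def cone(rho, f, x):
    """h_x(y) = f(x) - rho(x, y), as in the proof of Theorem 8.1."""
    return [f[x] - rho[x][y] for y in range(len(f))]

def sup_of_cones(rho, f):
    """Pointwise supremum of the functions h_x over all points x."""
    n = len(f)
    cones = [cone(rho, f, x) for x in range(n)]
    return [max(c[y] for c in cones) for y in range(n)]

# Four points on the real line with the usual distance (an illustrative choice).
positions = [0.0, 1.0, 3.0, 4.0]
rho = [[abs(p - q) for q in positions] for p in positions]

if __name__ == "__main__":
    f = [0.0, 1.0, 0.5, 1.5]   # Lipschitz constant exactly 1 for this metric
    g = [1.0, 0.0, 1.0, 0.0]
    print("L(f) =", lipschitz_seminorm(rho, f))
    print("sup of cones =", sup_of_cones(rho, f))
    print("L(f v g) =", lipschitz_seminorm(rho, join(f, g)))
```

Each cone has Lipschitz constant 1, and their supremum recovers $f$, which is exactly the situation to which Lemma 8.2 is applied.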

8.3 Corollary. Let $X$ be a compact space, and let $A$ be a dense subspace of $C(X)$ which
contains the constant functions and is closed under the finite lattice operations. Let $L$ be
a Lip-norm on $A$, and suppose that

$$L(f \vee g) \le L(f) \vee L(g)$$

for all $f, g \in A$. Then $L$ is the restriction to $A$ of the ordinary Lip-norm on $C(X)$
corresponding to the metric $\rho_L$ on $X$.

Proof. Let $f, g \in C(X)$. Then from Lemma 8.2 we see immediately that

$$\bar L(f \vee g) \le \bar L(f) \vee \bar L(g).$$

We can thus apply Theorem 8.1 to obtain the desired conclusion. □

One way of viewing Theorem 8.1 is that it characterizes the Lip-norms on commutative
C*-algebras which come from the corresponding metric on the extreme points of $S(A)$. It
would be interesting to have a corresponding characterization for non-commutative
C*-algebras, and for general order-unit spaces.

9. Lip-norms from metrics on S(A)

It is natural to ask which metrics on $S(A)$ arise from Lip-norms on $A$. We obtain here
a characterization of such metrics. Many of the steps work for arbitrary convex sets, and
so at first we will work in that setting. Thus we let $V$ be any vector space over
$\mathbb{R}$, and we let $K$ be any convex set in $V$ which spans $V$. Much as above, let
$D_2 = K - K$. Note that not only is $D_2$ convex, but it is also balanced, in the sense that if
$\lambda \in D_2$ and if $t \in [-1, 1]$, then $t\lambda \in D_2$. To see this, note that if
$\lambda \in D_2$ then clearly $-\lambda \in D_2$, so we only need consider $t \ge 0$. But

$$t(\mu - \nu) = \mu - (t\nu + (1 - t)\mu),$$

which is in $D_2$ by the convexity of $K$. Let $V^0 = \mathbb{R}D_2$. Then $V^0$ is a vector
subspace of $V$. In the setting where $K = S(A)$ we know that $V^0$ is a proper subspace of $V$.
Let $M$ be a norm on $V^0$. Then we can define a metric, $\rho$, on $K$ by
$\rho(\mu, \nu) = M(\mu - \nu)$. We want to characterize the metrics which arise in this way.

The most natural property to expect is that $\rho$ be convex (in each variable), that is:

9.1 Definition. We say that a metric $\rho$ on $K$ is convex if for every
$\mu, \nu_1, \nu_2 \in K$ and $t \in [0, 1]$ we have

$$\rho(\mu, t\nu_1 + (1 - t)\nu_2) \le t\rho(\mu, \nu_1) + (1 - t)\rho(\mu, \nu_2).$$

The metrics coming from norms on $V^0$ are convex because

$$\mu - (t\nu_1 + (1 - t)\nu_2) = t(\mu - \nu_1) + (1 - t)(\mu - \nu_2).$$

Given a metric $\rho$ on $K$, our strategy will be to try to use $\rho$ to define a norm, $M$,
on $V^0$ by first defining it on $D_2$. Specifically, for $\lambda \in D_2$ we would like to set

$$M(\lambda) = \rho(\mu, \nu)$$

for $\lambda = \mu - \nu$ with $\mu, \nu \in K$. But we need to know that this is well-defined.
That is, we need to know that if $\mu, \nu, \mu', \nu' \in K$ and if
$\mu - \nu = \mu' - \nu'$, then $\rho(\mu, \nu) = \rho(\mu', \nu')$. This can be rewritten in terms
of midpoints so as to appear a bit closer to considerations of convexity, namely, that if

(9.2) $$(\mu + \nu')/2 = (\mu' + \nu)/2$$

then $\rho(\mu, \nu) = \rho(\mu', \nu')$. This clearly holds for the metrics coming from norms.
One finds an attractive geometrical interpretation when one draws a picture of this relation.

9.3 Definition. We say that a metric $\rho$ on $K$ is midpoint-balanced if whenever equation
(9.2) above holds, it follows that $\rho(\mu, \nu) = \rho(\mu', \nu')$.

Let us now assume that $\rho$ is midpoint-balanced. Then $M$ on $D_2$ is well-defined as
above. We wish to extend it to a norm on $V^0$. For this to be possible we first must have
the property that if $t \in \mathbb{R}$, $|t| \le 1$, and if $\lambda \in D_2$, then
$M(t\lambda) = |t|M(\lambda)$. Now from the definition of $M$ it is clear that
$M(-\lambda) = M(\lambda)$. Thus it suffices to treat the case in which $t > 0$. If
$\lambda = \mu - \nu$, then

$$t\lambda = t(\mu - \nu) = \mu - (t\nu + (1 - t)\mu),$$

so that by the definition of $M$ we have $M(t\lambda) = \rho(\mu, t\nu + (1 - t)\mu)$. From
convexity, $\rho(\mu, t\nu + (1 - t)\mu) \le t\rho(\mu, \nu)$. But also
$t\lambda = (t\mu + (1 - t)\nu) - \nu$, which gives a similar inequality. Then from the triangle
inequality and convexity we have

$$\rho(\mu, \nu) \le \rho(\mu, t\nu + (1 - t)\mu) + \rho(t\nu + (1 - t)\mu, \nu) \le t\rho(\mu, \nu) + (1 - t)\rho(\mu, \nu) = \rho(\mu, \nu).$$

Thus the inequalities must be equalities, and we obtain:

9.4 Lemma. Let $\rho$ be a metric on $K$ which is convex and midpoint balanced. Define $M$
on $D_2$ as above using $\rho$. Then for any $\mu, \nu \in K$ and $t \in [0, 1]$ we have

$$\rho(\mu, t\nu + (1 - t)\mu) = t\rho(\mu, \nu),$$

and for any $\lambda \in D_2$ and $t \in [-1, 1]$ we have

$$M(t\lambda) = |t|M(\lambda).$$

Next, we need that $M$ is subadditive on $D_2$. This means that if $\lambda, \lambda' \in D_2$
and if $\lambda + \lambda' \in D_2$, then $M(\lambda + \lambda') \le M(\lambda) + M(\lambda')$. Let
$\lambda = \mu - \nu$, $\lambda' = \mu' - \nu'$. Then
$\lambda + \lambda' = (\mu + \mu') - (\nu + \nu')$. Assuming that $\rho$ is convex and
midpoint-balanced, we obtain from Lemma 9.4 that

$$M(\lambda + \lambda') = 2M((\lambda + \lambda')/2).$$

Now $(\lambda + \lambda')/2 = (\mu + \mu')/2 - (\nu + \nu')/2$, and
$(\mu + \mu')/2,\ (\nu + \nu')/2 \in K$. Thus

$$M((\lambda + \lambda')/2) = \rho((\mu + \mu')/2,\ (\nu + \nu')/2),$$

and we see that what we need is:

9.5 Definition. We say that a metric $\rho$ on $K$ is midpoint concave if for any
$\mu, \nu, \mu', \nu' \in K$ we have

$$\rho((\mu + \mu')/2,\ (\nu + \nu')/2) \le (1/2)(\rho(\mu, \nu) + \rho(\mu', \nu')).$$

Again one finds an attractive geometrical interpretation when one draws a picture of
this inequality. From the discussion above we now know that:

9.6 Lemma. Let $\rho$ be a metric on $K$ which is convex, midpoint balanced, and midpoint
concave. Define $M$ on $D_2$ as above. If $\lambda, \lambda' \in D_2$ and if
$\lambda + \lambda' \in D_2$, then

$$M(\lambda + \lambda') \le M(\lambda) + M(\lambda').$$

9.7 Theorem. Let $\rho$ be a metric on the convex subset $K$ of $V$, and let
$V^0 = \mathbb{R}D_2 = \mathbb{R}(K - K)$. Then there is a norm, $M$, on $V^0$ such that
$\rho(\mu, \nu) = M(\mu - \nu)$ for all $\mu, \nu \in K$, if and only if $\rho$ is convex, midpoint
balanced, and midpoint concave. The norm $M$ is unique.

Proof. The uniqueness is clear since $V^0 = \mathbb{R}(K - K)$. We have seen above that the
conditions on $\rho$ are necessary. We now show that they are sufficient. We let $M$ be defined
on $D_2 = K - K$ as above. For any $\lambda \in V^0$ there is a $t > 0$ such that
$t\lambda \in D_2$. We want to extend $M$ to $V^0$ by setting

$$M(\lambda) = t^{-1}M(t\lambda).$$

From Lemma 9.4 it is easily seen that $M$ is well-defined, and furthermore that
$M(s\lambda) = |s|M(\lambda)$ for all $s \in \mathbb{R}$ and $\lambda \in V^0$. The subadditivity
of $M$ then follows easily from Lemma 9.6. □
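The three conditions of Theorem 9.7 can be tested by brute force on a finite sample of a convex set. The sketch below is illustrative and rests on choices made here: $K$ is the simplex of probability vectors on three points, sampled on a grid; the norm is the $\ell^1$ norm; and the comparison metric is the square root of the $\ell^1$ distance, which is a metric but does not come from a norm. All function names are hypothetical.

```python
import itertools
import math

def simplex_grid(n):
    """Probability vectors on 3 points with coordinates in {0, 1/n, ..., 1}."""
    return [[i / n, j / n, (n - i - j) / n]
            for i in range(n + 1) for j in range(n + 1 - i)]

def mix(t, a, b):
    """Convex combination t*a + (1 - t)*b."""
    return [t * x + (1 - t) * y for x, y in zip(a, b)]

def norm_metric(mu, nu):
    """rho(mu, nu) = M(mu - nu) with M the l^1 norm."""
    return sum(abs(m - n) for m, n in zip(mu, nu))

def snowflake_metric(mu, nu):
    """Square root of the l^1 distance: a metric that is not induced by a norm."""
    return math.sqrt(norm_metric(mu, nu))

def is_convex(rho, points, ts=(0.25, 0.5, 0.75), tol=1e-9):
    """Definition 9.1, checked on the sample points."""
    return all(
        rho(mu, mix(t, nu1, nu2)) <= t * rho(mu, nu1) + (1 - t) * rho(mu, nu2) + tol
        for mu, nu1, nu2 in itertools.product(points, repeat=3)
        for t in ts
    )

def is_midpoint_balanced(rho, points, tol=1e-9):
    """Definition 9.3: rho(mu, nu) = rho(mu', nu') whenever mu - nu = mu' - nu'."""
    for mu, nu, mu2 in itertools.product(points, repeat=3):
        nu2 = [c - a + b for a, b, c in zip(mu, nu, mu2)]  # nu' = mu' - mu + nu
        if min(nu2) < -1e-12:
            continue  # nu' falls outside K, so the condition says nothing here
        if abs(rho(mu, nu) - rho(mu2, nu2)) > tol:
            return False
    return True

def is_midpoint_concave(rho, points, tol=1e-9):
    """Definition 9.5, checked on the sample points."""
    return all(
        rho(mix(0.5, mu, mu2), mix(0.5, nu, nu2)) <= 0.5 * (rho(mu, nu) + rho(mu2, nu2)) + tol
        for mu, nu, mu2, nu2 in itertools.product(points, repeat=4)
    )

if __name__ == "__main__":
    pts = simplex_grid(4)
    for name, rho in [("norm", norm_metric), ("snowflake", snowflake_metric)]:
        print(name, is_convex(rho, pts), is_midpoint_balanced(rho, pts),
              is_midpoint_concave(rho, pts))
```

A finite check of this kind can only refute the conditions, never prove them; it is meant as a sanity test of the definitions, not as a substitute for the theorem.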

We now want to apply the above ideas to $S(A)$ for an order-unit space $A$. Note that
the $V^0$ of just above is then the $A'^0$ of earlier. We will need the following theorem, which
does not involve the above ideas.

9.8 Theorem. Let $A$ be an order-unit space, and let $M$ be a norm on $A'^0$. Define a
metric, $\rho$, on $S(A)$ by

$$\rho(\mu, \nu) = M(\mu - \nu).$$

If the $\rho$-topology coincides with the weak-* topology on $S(A)$, then

$$M = (L_\rho)'$$

on $A'^0$.

Proof. Since $\mathrm{Lip}_\rho$ is a subspace of $C(S(A))$, we can set
$A_L = (\mathrm{Lip}_\rho) \cap Af(S(A))$. Note that $A_L$ need not be contained in $A$ unless
$A$ is complete. Initially it is not clear how big $A_L$ is. Parallel to our earlier notation,
let $V$ denote the normed space $A'^0$ with norm $M$. Note that $V$ need not be complete. Let
$V'$ denote the Banach space dual of $V$, with dual norm $M'$. Fix any $\nu_0 \in S(A)$. For any
$\varphi \in V'$ define a function, $\tau(\varphi)$, on $S(A)$ by

$$\tau(\varphi)(\mu) = \varphi(\mu - \nu_0).$$

Then for $\mu, \nu \in S(A)$ we have

$$|\tau(\varphi)(\mu) - \tau(\varphi)(\nu)| = |\varphi(\mu - \nu)| \le M'(\varphi)M(\mu - \nu) = M'(\varphi)\rho(\mu, \nu).$$

Thus $\tau(\varphi) \in \mathrm{Lip}_\rho$ and $L_\rho(\tau(\varphi)) \le M'(\varphi)$. In
particular, $\tau(\varphi)$ is continuous on $S(A)$ since $\rho$ gives the weak-* topology.
Furthermore it is easily seen that $\tau(\varphi)$ is affine on $S(A)$. Thus
$\tau(\varphi) \in A_L$. Consequently $\tau$ is a norm-non-increasing linear map from
$(V', M')$ to $(A_L, L_\rho)$. Let $\tilde\tau$ denote $\tau$ composed with the map from $A_L$ to
$\tilde A_L$. Then it is easily seen that $\tilde\tau$ does not depend on the choice of $\nu_0$.
We now need:

9.9 Lemma. Let $\bar A = Af(S(A))$, the completion of $A$ for $\|\cdot\|$, so that
$A_L \subseteq \bar A$. Then $A_L$ is dense in $\bar A$.

Proof. Since $\mathbb{R}e \subseteq A_L$, it suffices to show that $\tilde A_L$ is dense in
$\bar A^{\sim}$. Let $\lambda \in D_2 \subseteq A'^0 = (\bar A^{\sim})'$. Suppose that
$\lambda(\tilde A_L) = 0$. Let $\lambda = \mu - \nu$ with $\mu, \nu \in S(A)$. For any
$\varphi \in V'$ we have $\tau(\varphi) \in A_L$, so

$$0 = \lambda(\tau(\varphi)) = \mu(\tau(\varphi)) - \nu(\tau(\varphi)) = \varphi(\mu - \nu_0) - \varphi(\nu - \nu_0) = \varphi(\lambda).$$

Since this is true for all $\varphi \in V'$, it follows that $\lambda = 0$. Since $D_2$ spans
$A'^0$, an application of the Hahn-Banach theorem now shows that $\tilde A_L$ is dense in
$\bar A^{\sim}$. □

Now let $f \in A_L$. We seek to define a linear functional, $\sigma(f)$, on $A'^0$ related to the
$\sigma$ in the proof of Theorem 5.2. We first try to define $\sigma$ on $D_2$ by

$$\sigma(f)(\lambda) = f(\mu) - f(\nu)$$

where $\lambda = \mu - \nu$ for $\mu, \nu \in S(A)$. But we need to show that $\sigma(f)$ is
well-defined. We argue much as we did before Definition 9.3. If also
$\lambda = \mu_1 - \nu_1$ for $\mu_1, \nu_1 \in S(A)$, then
$(\mu + \nu_1)/2 = (\mu_1 + \nu)/2$. But these are elements of $S(A)$ and so

$$f((\mu + \nu_1)/2) = f((\mu_1 + \nu)/2).$$

But from the fact that $f$ is affine it now follows that

$$f(\mu) - f(\nu) = f(\mu_1) - f(\nu_1).$$

Thus $\sigma(f)$ is well-defined on $D_2$. We now need to know that $\sigma(f)$ is "linear" on
$D_2$. The proof that $\sigma(f)(t\lambda) = t\sigma(f)(\lambda)$ for $t \in [-1, 1]$ is similar to
the proof of Lemma 9.4. The proof that
$\sigma(f)(\lambda + \lambda_1) = \sigma(f)(\lambda) + \sigma(f)(\lambda_1)$ if
$\lambda + \lambda_1 \in D_2$ is similar to the argument just before Definition 9.5. The proof
that $\sigma(f)$ then extends to a linear functional on $A'^0$ is similar to the arguments in the
proof of Theorem 9.7. For $\lambda = \mu - \nu$ with $\mu, \nu \in S(A)$ we have

$$|\sigma(f)(\lambda)| = |f(\mu) - f(\nu)| \le L_\rho(f)\rho(\mu, \nu) = L_\rho(f)M(\mu - \nu) = L_\rho(f)M(\lambda).$$

It follows that $\sigma(f) \in V'$ and $M'(\sigma(f)) \le L_\rho(f)$. Thus $\sigma$ is a
norm-non-increasing linear map from $(A_L, L_\rho)$ to $(V', M')$. Note that the constant
functions are in the kernel of $\sigma$, so that $\sigma$ determines a norm-non-increasing linear
map from $(\tilde A_L, L_\rho)$ to $(V', M')$. But for $f \in A_L$ we have

$$\tau(\sigma(f))(\mu) = \sigma(f)(\mu - \nu_0) = f(\mu) - f(\nu_0).$$

Consequently $\tilde\tau(\sigma(f)) = \tilde f$. Similarly, for $\varphi \in V'$ and
$\lambda = \mu - \nu$ we have

$$\sigma(\tilde\tau(\varphi))(\lambda) = \tau(\varphi)(\mu) - \tau(\varphi)(\nu) = \varphi(\mu - \nu_0) - \varphi(\nu - \nu_0) = \varphi(\lambda),$$

so that $\sigma(\tilde\tau(\varphi)) = \varphi$. Thus $\sigma$ and $\tilde\tau$ are inverses of
each other. Since they are norm-non-increasing, we obtain:

9.10 Lemma. The map $\tilde\tau$ is an isometric isomorphism of $(V', M')$ onto
$(\tilde A_L, L_\rho)$, with inverse $\sigma$.

We can now complete the proof of Theorem 9.8. Since $A_L$ is dense in $\bar A$ by Lemma 9.9,
for any $\lambda \in V$ we have

$$(L_\rho)'(\lambda) = \sup\{\lambda(\tilde\tau(\varphi)) : L_\rho(\tilde\tau(\varphi)) \le 1\} = \sup\{\varphi(\lambda) : M'(\varphi) \le 1\} = M(\lambda).$$

□

Putting together the various pieces of this section, we obtain: 

9.11 Theorem. Let $A$ be an order-unit space, and let $\rho$ be a metric on $S(A)$ which gives
the weak-* topology. Then $\rho$ comes from a Lip-norm $L$ on $A$ via the relation

$$\rho(\mu, \nu) = L'(\mu - \nu)$$

if and only if $\rho$ is convex, midpoint balanced, and midpoint concave.

Nik Weaver has suggested to me the following alternative treatment of the material of
this section. Let $V$, $K$, and $V^0$ be as at the beginning of this section.

9.12 Definition. We say that a metric $\rho$ on $K$ is linear if for every $\mu, \nu \in K$,
every $v \in V^0$, and every $t \in \mathbb{R}^+$ such that $\mu + tv$ and $\nu + v$ are in $K$
we have

$$\rho(\mu, \mu + tv) = t\rho(\nu, \nu + v).$$

It is easily seen that if $\rho$ comes from a norm on $V^0$ then $\rho$ is linear. Conversely, if
$\rho$ is linear, define a norm, $M$, on $V^0$ by

$$M(v) = \rho(\mu, \mu + tv)/t$$

for any $\mu \in K$ and any $t \in \mathbb{R}^+$ such that $\mu + tv \in K$. One checks that $M$
is well-defined and is indeed a norm. Furthermore, $\rho$ comes from $M$.

Weaver also points out that if V is a locally convex topological vector space and if K 
is compact, then for a suitable definition of p being compatible with the topology, one 
can show that when p is linear and compatible, then K is isometrically isomorphic to 
S(Af(K)) when the latter is given the metric coming from the Lipschitz seminorm on 
Af(K) coming from p. 

It is not clear that examples will come up where it is actually useful to apply the 
considerations of this section in order to obtain Lip-norms. Until such examples arise, it 
will not be clear whether my version or Weaver's will be the more useful. 

10. Musings on metrics 

Since the theory in the previous sections worked for order-unit spaces, which need not 
be algebras, the Leibniz inequality played no significant role there. Indeed, even when 
one has an algebra, I have not seen how to make effective use of the Leibniz inequality. 
Nevertheless, most constructions of Lipschitz seminorms which I have seen in the literature 
seem to provide ones which do satisfy the Leibniz inequality. We will briefly explore here 
a variety of such constructions, and the relationships between them. Our interest will be 
on seeing general patterns, and we will not try to deal carefully with the many technical 
issues which arise. Thus we will be less precise than in the previous sections. 

A very natural way to look for Lipschitz seminorms, closely related to Weaver's
W*-derivations [W2], goes as follows. Let $A$ be a unital algebra and let $(\Omega, d)$ be a
first-order differential calculus for $A$. Thus $\Omega$ (which is also often denoted
$\Omega^1$) is an $A$-$A$-bimodule, and $d$ is an $\Omega$-valued derivation on $A$, that is, a
linear map from $A$ into $\Omega$ which satisfies the Leibniz identity

$$d(ab) = (da)b + a(db).$$

We do not require that the range of $d$ generates $\Omega$. Suppose now that $A$ is in fact a
normed algebra, and that we have a bimodule norm, $N$, on $\Omega$ (for the norm $\|\cdot\|$ on
$A$), that is, a norm such that

$$N(a\omega b) \le \|a\|N(\omega)\|b\|$$

for $a, b \in A$ and $\omega \in \Omega$. Define a seminorm $L$ on $A$ by

$$L(a) = N(da).$$

It is easily seen that $L$ satisfies the Leibniz inequality. Since $d1 = 0$, we have
$L(1) = 0$. Of course, without further hypotheses the null-space of $L$ may be much bigger. (We
should mention that not all seminorms satisfying the Leibniz inequality can be constructed in
this way; see the discussion in [BC].)
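The Leibniz inequality for $L(a) = N(da)$ follows from $N(d(ab)) \le N((da)b) + N(a(db)) \le N(da)\|b\| + \|a\|N(db)$. As a sanity check, the sketch below tests it numerically in one concrete case chosen here for illustration: $A = \Omega = M_2(\mathbb{R})$ with the operator norm, and the inner derivation $da = Da - aD$ for a fixed skew-symmetric $D$. The helper names are hypothetical.

```python
import math
import random

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matsub(a, b):
    """Entrywise difference of two 2x2 matrices."""
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def op_norm(m):
    """Operator norm of a real 2x2 matrix: sqrt of the largest eigenvalue of m^T m."""
    trace = sum(m[i][j] ** 2 for i in range(2) for j in range(2))  # tr(m^T m)
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]                    # det(m^T m) = det^2
    disc = max(trace * trace - 4.0 * det * det, 0.0)
    return math.sqrt((trace + math.sqrt(disc)) / 2.0)

def d(D, a):
    """Inner derivation da = Da - aD."""
    return matsub(matmul(D, a), matmul(a, D))

def lip(D, a):
    """L(a) = N(da), with N the operator norm."""
    return op_norm(d(D, a))

def leibniz_slack(D, a, b):
    """L(a)||b|| + ||a||L(b) - L(ab); the Leibniz inequality says this is >= 0."""
    return lip(D, a) * op_norm(b) + op_norm(a) * lip(D, b) - lip(D, matmul(a, b))

def random_matrix(rng):
    """A 2x2 matrix with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]

if __name__ == "__main__":
    rng = random.Random(0)
    D = [[0.0, 1.5], [-1.5, 0.0]]
    worst = min(leibniz_slack(D, random_matrix(rng), random_matrix(rng))
                for _ in range(1000))
    print("smallest slack over 1000 random pairs:", worst)
```

The smallest slack should be non-negative up to rounding error, and $L$ vanishes on the identity matrix, in line with $d1 = 0$.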

There is a universal first-order differential calculus for any unital algebra A [Ar, C2]. 
We approach this in a way which emphasizes more than usual those differential calculi 
which are inner, since at least conceptually that is what Dirac operators give, as we will 
see shortly. We form the algebraic tensor product 

with bimodule structure defined as usual by a(b <8> c)d = ab ® cd. We define d by 

da — l®a — a<g) 1. 

10.1 Definition. A first-order calculus $(\Omega, d)$ is inner if there is an
$\omega_0 \in \Omega$ such that

\[ da = \omega_0 a - a\omega_0. \]

Then the calculus $(\Omega^{ui}, d)$ defined above is inner, with $\omega_0 = 1 \otimes 1$.
Note that here $\omega_0$ may not be in the sub-bimodule generated by the range of d. This
is an indication of why we do not require this generation property. It is simple to verify:

10.2 Proposition. The inner first-order calculus $(\Omega^{ui}, d, 1 \otimes 1)$ is universal
among inner first-order differential calculi over A, in the sense that if
$(\Omega', d', \omega_0')$ is any other inner first-order differential calculus, then there
is a bimodule homomorphism $\Phi : \Omega^{ui} \to \Omega'$ such that $\Phi(da) = d'a$ and
$\Phi(1 \otimes 1) = \omega_0'$. In particular,

\[ \Phi(a \otimes b) = a\,\omega_0'\,b \]

for $a, b \in A$. If $\Omega'$ is generated by $\omega_0'$ as bimodule, then $\Phi$ is
surjective, so that $\Omega'$ is a quotient of $\Omega^{ui}$.

10.3 Proposition. Any first-order differential calculus is contained in an inner first-order
calculus.

Proof. Let $(\Omega, d)$ be a first-order calculus. Set $\tilde\Omega = \Omega \oplus A$ as
left A-module, set $\tilde d a = da \oplus 0$, and set $\omega_0 = 0 \oplus 1$. We must extend
the right action of A on $\Omega$ to a right action on $\tilde\Omega$ such that
$\tilde d a = \omega_0 a - a\omega_0$. Thus it is clear that we must set
$(0 \oplus 1)a = \omega_0 a = \tilde d a + a\omega_0 = da \oplus a$, and so

\[ (\omega, b)a = (\omega a + b\,da, \; ba). \]

It is simple to check that this gives the desired structure. □

Now let $\Omega^u$ denote the sub-bimodule of $\Omega^{ui}$ generated by the range of d, and
so spanned by elements of the form

\[ a\,db = a \otimes b - ab \otimes 1. \]

Let $(\Omega', d')$ be a first-order differential calculus which is not inner. Expand it to an
inner calculus by the construction of the previous proposition, and then restrict $\Phi$ of
that proposition to $\Omega^u$. It is clear from the construction that $\Phi$ will carry
$\Omega^u$ into $\Omega'$, where $\Omega'$ is viewed as a sub-bimodule of its expansion. We
obtain in this way:

10.4 Proposition. The calculus $(\Omega^u, d)$ is universal among all first-order differential
calculi over A, in the sense that if $(\Omega', d')$ is any other first-order differential
calculus, then there is a bimodule homomorphism $\Phi : \Omega^u \to \Omega'$ such that
$\Phi(da) = d'a$. If $\Omega'$ is generated by the range of $d'$ as bimodule, then $\Phi$ is
surjective, so that $\Omega'$ is a quotient of $\Omega^u$.

We notice that if $(\Omega, d)$ is any first-order differential calculus and if $\mathcal{N}$
is any sub-bimodule of $\Omega$, then we obtain a calculus $(\Omega/\mathcal{N}, d')$ where
$d'$ is the composition of d with the canonical projection of $\Omega$ onto
$\Omega/\mathcal{N}$. However, unlike the universal calculus, there may now be many more
elements a for which $d'a = 0$ beyond the scalar multiples of 1.

Let us examine briefly what the above looks like when A = C(X) for a compact space
X. Then $\Omega^{ui}$ $(= A \otimes A)$ is naturally viewed as a dense sub-bimodule, in fact
subalgebra, of $C(X \times X)$. The bimodule actions are, of course,

(fF)(x,y) = f(x)F(x,y), (Ff)(x,y) = F(x,y)f(y), 

and $\omega_0 = 1 \otimes 1$ is the constant function 1, so that d is given by

(df)(x,y) = f(y)-f(x). 

Then $\Omega^u$ is spanned by the $f\,dg$, where

(fdg)(x,y) = f(x)(g(y)-g(x)). 

Thus the elements of $\Omega^u$ take value 0 on the diagonal, $\Delta$, of $X \times X$, and
consequently $\Omega^u \subset C_\infty(X \times X \setminus \Delta)$. In fact it is easy to
see that $\Omega^u$ is a dense subalgebra of $C_\infty(X \times X \setminus \Delta)$.
Let $\rho$ be an ordinary metric on X (giving the topology of X). View $\rho$ as a strictly
positive function on $X \times X \setminus \Delta$, and let $\gamma = \rho^{-1}$. Then
$\gamma$ is a continuous function on $X \times X \setminus \Delta$, but $\gamma$ is unbounded
if X is not finite. Let $C(X \times X \setminus \Delta)$ denote the algebra of continuous
possibly-unbounded functions on $X \times X \setminus \Delta$. Then
$C(X \times X \setminus \Delta)$ can be viewed as the algebra of operators affiliated with the
C*-algebra $C_\infty(X \times X \setminus \Delta)$ in the sense studied by Baaj [Ba] and
Woronowicz [Wo]. In an evident way $C(X \times X \setminus \Delta)$ is an A-A-bimodule,
containing $\gamma$.

There are now two routes which we can take. One is to consider the inner derivation,
$d_\gamma$, defined by $\gamma$. Thus

\[ (d_\gamma f)(x, y) = \gamma(x, y) f(y) - f(x)\gamma(x, y) = (f(y) - f(x))/\rho(x, y). \]

Then we can consider bimodule norms, possibly taking value $+\infty$, on
$C(X \times X \setminus \Delta)$, as a way to obtain Lipschitz norms on A. The other route is
to use $\gamma$ (or $\rho$) to directly define norms on $C_\infty(X \times X \setminus \Delta)$.
For the first route the most obvious norm is the supremum norm, which leads to the usual
definition of the Lipschitz seminorm for a metric space.

However, we choose to explore further the second route. (But most of what we find will
have a fairly evident reinterpretation in terms of the first route.) There is a large variety
of ways to obtain bimodule norms on $C_\infty(X \times X \setminus \Delta)$. The one which
gives the usual definition of the Lipschitz seminorm for a metric is clearly

\[ N(F) = \|\gamma F\|_\infty, \]

permitted to take value $+\infty$. But here are some others. Let m be any positive (finite)
measure on X, and assume that $m \times m$ restricted to $X \times X \setminus \Delta$ has as
support all of $X \times X \setminus \Delta$. Then one can consider all of the $L^p$-norms for
$m \times m$. If one wants to put $\gamma$ (or $\rho$) explicitly into the picture, one can
consider the measure $\gamma(m \times m)$, although this just represents the choice of a
different measure. Note that if f is an ordinary Lipschitz function for $\rho$, then
$\gamma\,df$ is a bounded function on $X \times X \setminus \Delta$, so that
$\|\gamma\,df\|_{p,\, m \times m}$ is finite. Thus the subalgebra of elements of A for which
this Lipschitz seminorm is finite is dense in A.

To explore further possibilities, let us for simplicity assume that X is finite. Then
$\Omega^{ui} = C(X \times X)$ can be viewed as the algebra of all matrices whose entries are
indexed by elements of $X \times X$. The left and right actions of A on $\Omega^{ui}$ can be
viewed as coming from embedding A as the diagonal matrices and using left and right matrix
multiplication. Then $\omega_0$ is the matrix with a 1 in each entry. On A we keep the
supremum norm, but on the matrix algebra $\Omega^{ui}$ we can consider any A-A-bimodule norm.
Let B denote $\Omega^{ui}$ viewed as matrix algebra, and equipped with the usual C*-algebra
norm. View $\Omega^{ui}$ as a B-B-bimodule in the evident way. Then we can consider
B-B-bimodule norms on $\Omega^{ui}$. Any such will in particular be an A-A-bimodule norm. But
there has been extensive study of the possible B-B-bimodule norms on $\Omega^{ui}$. They are
commonly called "symmetric norms", and among the best known are the Schatten p-norms, which
include the Hilbert-Schmidt norm and the trace norm. These have, of course, also been
extensively studied for operators on infinite dimensional Hilbert spaces, and play a
fundamental role in Connes' theory of integration on non-commutative spaces. (See [C2]
Chapter IV and its Appendix D. A nice treatment of the finite case can be found in [Bh].)
From every symmetric norm we obtain a Lip-norm on A (since A is finite-dimensional). This
does not exhaust the possibilities, as there is no necessity to restrict to symmetric norms
in order to get A-A-bimodule norms.
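
The matrix picture for finite X is easy to experiment with. The following is a minimal sketch, not taken from the paper: the three-point space, the metric `rho`, the function `f`, and all helper names are hypothetical choices made for illustration. It forms df as the commutator of the all-ones matrix with a diagonal matrix, and then evaluates two different A-A-bimodule norms on it: the supremum norm of $\gamma\,df$ (the usual Lipschitz constant) and an $\ell^2$ norm of $\gamma\,df$ (one of the $L^p$-type alternatives, for counting measure).

```python
import math

# Hypothetical three-point space X = {0, 1, 2}; rho is an assumed metric on it.
rho = [[0.0, 1.0, 2.0],
       [1.0, 0.0, 1.5],
       [2.0, 1.5, 0.0]]
n = len(rho)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diag(f):
    # Embed a function on X as a diagonal matrix.
    return [[f[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def d(f):
    # df = omega0 f - f omega0, where omega0 is the matrix of all ones.
    omega0 = [[1.0] * n for _ in range(n)]
    left = matmul(omega0, diag(f))   # entry (x, y) equals f(y)
    right = matmul(diag(f), omega0)  # entry (x, y) equals f(x)
    return [[left[x][y] - right[x][y] for y in range(n)] for x in range(n)]

def gamma_times(F):
    # Pointwise product with gamma = 1/rho off the diagonal (0 on the diagonal).
    return [[F[x][y] / rho[x][y] if x != y else 0.0 for y in range(n)]
            for x in range(n)]

def sup_norm(F):
    return max(abs(v) for row in F for v in row)

def l2_norm(F):
    return math.sqrt(sum(v * v for row in F for v in row))

def lip_sup(f):
    # N(F) = ||gamma F||_infinity applied to F = df: the usual Lipschitz constant.
    return sup_norm(gamma_times(d(f)))

def lip_l2(f):
    # A different bimodule norm (an l^2 norm of gamma F), giving a different seminorm.
    return l2_norm(gamma_times(d(f)))

f = [0.0, 1.0, 3.0]
print(d(f))        # matrix with entries f(y) - f(x)
print(lip_sup(f))  # largest difference quotient of f
print(lip_l2(f))
```

Both seminorms vanish exactly on the constant functions here, since every off-diagonal entry of df is weighted by a nonzero factor; they differ in how they aggregate the difference quotients.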

All of the above discussion has been for the universal differential calculus. We get many
more possibilities by using other differential calculi. We continue to concentrate on the
case of A = C(X) with X compact. Now sub-A-A-bimodules of $C(X \times X)$, when closed
in the supremum norm, will be ideals of $C(X \times X)$, and the quotient can be identified
with C(W) for some closed subset W of $X \times X$. We can restrict df to W. But some
condition must be placed on W if we want to ensure that $df|_W = 0$ only if f is a constant
function. For this purpose it is convenient to assume, to begin with, that W contains the
diagonal $\Delta$ and is symmetric about $\Delta$, that is, if $(x, y) \in W$ then
$(y, x) \in W$. Given $x \in X$ we define the W-neighborhood of x to be the (closed) set of
those $y \in X$ such that $(x, y) \in W$. By the W-component of x we mean the smallest closed
subset of X which contains the W-neighborhood of each of its points. If $df|_W = 0$, then f is
constant on the W-component of each point. Thus a sufficient condition under which
$df|_W = 0$ will imply that f is constant, is that the W-component of each point is all of X.
If X is a finite set, then $W \setminus \Delta$ can be viewed as consisting of the directed
edges for a graph whose vertices are the points of X. Then the above condition becomes the
condition that this graph is connected in the usual sense. If X is not discrete, it is usual
to require that W is a neighborhood of $\Delta$. Then each W-neighborhood of a point will be
an ordinary (closed) neighborhood, and so the W-component of each point will be both closed
and open. In particular, if X is connected it will be true that $df|_W = 0$ implies that f is
constant.

We remark that if W is a neighborhood of $\Delta$ and is symmetric about $\Delta$, and if we
set $\Omega = C(W)$, then the first order calculus $(\Omega, d)$ obtained as above is the
typical degree-one piece of the complexes $(\Omega^{*}, d)$ used in defining the
Alexander-Spanier cohomology of X. The higher-degree pieces are defined similarly but in
terms of $X^n$ for various n. The Alexander-Spanier cohomology is then obtained by taking a
limit of the homology of these complexes as W shrinks to $\Delta$. Essentially this view can
be seen in lemma 1.1 of [CM], where smooth functions on a manifold are used, and in Section 1
of [MW], where continuous functions are used.

Suppose now that $\Omega = C(W)$ as above, but assume now for simplicity that W and
$\Delta$ are disjoint (with W no longer required closed). Let d be defined by $df = df|_W$,
and assume that if df = 0 then f is a scalar multiple of 1. To obtain a Lipschitz seminorm on
A we again just need to put a bimodule norm on $\Omega$. The method which is closest to the
usual Lipschitz norm is to specify a nowhere zero function $\gamma$ on W and set

\[ L(f) = \|\gamma\,df\|_\infty \]

(on W, allowing value $+\infty$). In this context however, if we set $\rho = \gamma^{-1}$, it
no longer makes much sense to ask that the triangle inequality hold for $\rho$. About the
most that is reasonable is to ask that $\rho$, hence $\gamma$, be positive, and that
$\gamma(x, y) = \gamma(y, x)$ for $(x, y) \in W$, $x \ne y$. This is a situation which has
been widely studied. Entire books [Ra, RR] have been written about the problem of finding the
corresponding distance between two probability measures on X, often under the heading of
"the mass transportation problem". The function $\rho$ is then often called a "cost
function". We should clarify that when $\rho$ is not a metric we are dealing here with mass
transportation "with transshipment permitted" [RR], not the original Monge-Kantorovich [KA]
mass transportation problem, which does not permit transshipment, and may well not yield a
metric. When transshipment is permitted and $\rho$ is not a metric on X, the corresponding
metric on S(X) is called the Kantorovich-Rubinstein metric [KR1, KR2]. For a fascinating
survey of some recent developments concerning the original Monge-Kantorovich problem see [Ev].

When X is a finite set and W is viewed as specifying edges for a graph which has X
as set of vertices, the cost function $\rho$ is naturally interpreted as assigning lengths to
the edges (though we will see a quite different interpretation in Section 12). Then the metric
on X coming from $L_\rho$ is the usual path-length distance on the graph. There has been
much study of how to compute this path-length distance efficiently for large graphs. We
remark that if one prefers to have $\rho$ defined on all of $X \times X$ one can simply set
it equal to $+\infty$ on any $(x, y)$, $x \ne y$, which is not an edge.
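
To make the path-length statement concrete, here is a small sketch under stated assumptions: the four-vertex graph and its edge costs are hypothetical, and Floyd-Warshall is used simply as one standard way of computing shortest paths (the paper does not prescribe an algorithm). The function $f = \mathrm{dist}(x_0, \cdot)$ has edge-wise Lipschitz seminorm at most 1 and attains the path-length distance from $x_0$, which is why the metric defined by the seminorm cannot be smaller than the path-length distance; the reverse inequality follows by summing $|f(s) - f(t)| \le \rho(s, t)$ along a shortest path.

```python
import math

# Hypothetical graph on vertices 0..3; `cost` gives a length to each undirected edge.
cost = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 5.0, (2, 3): 2.0}
n = 4

def path_length_distance(n, cost):
    """Floyd-Warshall shortest paths; non-edges start at cost +infinity."""
    dist = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for (x, y), c in cost.items():
        dist[x][y] = min(dist[x][y], c)
        dist[y][x] = min(dist[y][x], c)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def lipschitz_seminorm(f, cost):
    """L(f) = sup over edges (x, y) of |f(y) - f(x)| / cost(x, y)."""
    return max(abs(f[y] - f[x]) / c for (x, y), c in cost.items())

dist = path_length_distance(n, cost)
f = dist[0]                          # f(y) = path-length distance from vertex 0
print(dist[0])                       # distances from vertex 0
print(lipschitz_seminorm(f, cost))   # does not exceed 1
```

In this example the edge (0, 2) has cost 5 but the distance between its endpoints is 2, because the route through vertex 1 is cheaper.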

We remark that in the context of cost functions on compact sets there may well be
no non-constant functions for which the Lipschitz seminorm is finite. As one example let
X be the unit interval [0,1], and set $\rho(x, y) = |x - y|^2$. This is, in effect, because we
permit transshipment; the original Monge-Kantorovich problem is quite interesting for
this particular cost function, as shown in [Ev]. It is just that the minimal cost of moving
one probability measure directly to another does not then give a metric on probability
measures, because it may be less costly to use two or more moves.
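
The effect of transshipment for the quadratic cost can be seen with a one-line computation. In this sketch (the function name `chained_cost` is a hypothetical helper, not notation from the paper) a unit mass is moved from x to y through n equal hops, each hop costing the square of its length, so the total is $n \cdot (|x - y|/n)^2 = |x - y|^2/n$, which tends to 0. Correspondingly, if $|f(s) - f(t)| \le C|s - t|^2$ for all s and t, then summing over the hops gives $|f(x) - f(y)| \le C|x - y|^2/n$ for every n, so f is constant.

```python
def chained_cost(x, y, steps):
    """Total cost of moving a unit mass from x to y through `steps` equal hops,
    where a single hop from s to t costs |s - t|**2."""
    hop = (y - x) / steps
    return steps * hop ** 2

costs = [chained_cost(0.0, 1.0, n) for n in (1, 10, 100, 1000)]
print(costs)  # decreasing like 1/n
```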

There is a variety of other bimodule norms, such as $L^p$-norms, which one can use for
various differential calculi, and these give a wide variety of metrics on probability measures
[Ra]. A particularly deep application of such norms, for the case of graphs, and involving
explicitly Connes' ideas of non-commutative metrics, appears in [Da]. (I thank Nik Weaver
for bringing this paper to my attention.)

Let us now discuss briefly the case in which we have $A = M_n$, a full matrix algebra.
As mentioned much earlier, one natural Lip-norm on A is just $L = \|\cdot\|^{\sim}$. Now
$A'$ can be identified by means of the normalized trace, $\tau$, with A itself, but equipped
with the trace-norm. Then $A'^{0}$, as in our earlier notation, consists of the matrices with
trace 0. Of course, S(A) is identified with the positive matrices of normalized trace 1. With
this identification, we have

\[ \rho_L(\mu, \nu) = \mathrm{trace}(|\mu - \nu|). \]

This is exactly one of the metrics listed (with references) in the introduction to [ZS].
Another one listed there uses the Hilbert-Schmidt norm instead of the trace norm. Listed
also is a variety of other metrics on $S(M_n)$ which have appeared in various applications.
But I have not checked whether they come from Lip-norms. There has also been much
study of the differential geometry of $S(M_n)$ for a variety of Riemannian metrics, especially
the "monotone metrics", which are closely related to operator monotone functions. Two
very recent articles which contain many references to previous work on this topic are
[Di, S]. But the emphasis of most of this work is not on the ordinary metric which a
Riemannian metric induces on $S(M_n)$, but rather on the differential geometric aspects. There
is also study of the volume form which is induced, and on associated probabilistic aspects.
For recent related study going in the direction of non-commutative entropy see [LR].

11. Dirac operators and differential calculi 

We continue our comments of the previous section, but here we focus on how Dirac
operators fit into the picture. Let A be a unital *-algebra equipped with a C*-norm
(perhaps not complete), and let $\pi$ be a faithful representation of A, that is, an isometric
*-homomorphism of A into the algebra $B(\mathcal{H})$ of bounded operators on a Hilbert space
$\mathcal{H}$. Let D be an essentially self-adjoint, possibly unbounded, operator on
$\mathcal{H}$, and assume that $\pi(a)$ carries the domain of D into itself for each
$a \in A$, and that on this domain $[D, \pi(a)]$ is a bounded operator, and so extends
uniquely to a bounded operator on $\mathcal{H}$. Then, following Connes, we set

\[ L(a) = \|[D, \pi(a)]\|. \]

As we did earlier, it is natural to require that $[D, \pi(a)] = 0$ only when a is a scalar multiple
of 1. Many important examples of this situation are now known. But in general it seems 
difficult to ascertain whether the corresponding metric on states gives the weak-* topology, 
though this has been shown for certain examples in [Rf]. See also [W2, W3, W5], where 
the sets Bt defined at the beginning of Section 3 are shown to be totally bounded, in fact 
compact, for various examples. We do not deal with this question here, but rather try 
to relate the bimodule picture to the Dirac picture. One direction is apparent. We view 
$B(\mathcal{H})$ as an A-A-bimodule by setting

\[ a\,T\,b = \pi(a)\,T\,\pi(b). \]

Then, although D is only affiliated with $B(\mathcal{H})$, conceptually we use the inner
derivation which D defines, so that

\[ da = D\pi(a) - \pi(a)D = [D, \pi(a)]. \]

(This, of course, is the starting point for Connes' non-commutative differential calculus
[C2].) We then note that the operator norm on $B(\mathcal{H})$ is an A-A-bimodule norm, and so
upon setting

\[ L(a) = \|[D, \pi(a)]\| \]

we obtain a Lipschitz norm, which we showed to be lower semicontinuous in Proposition
3.8.

But suppose we are given instead some first order differential calculus $(\Omega, d)$ and a
bimodule norm on $\Omega$ so that we obtain the corresponding Lipschitz norm L. Can we also
obtain L from a Dirac operator? For this to be possible we must have L(a*) = L(a), 
and L must be lower semicontinuous. As mentioned earlier, L must also fit into a family 
of "matrix Lipschitz seminorms". These conditions are probably not enough in general, 
though I have not tried to find a counterexample. But the following superficial comments 
help to give some perspective. (In most of the considerations which follow the algebra 
structure on A is only used in order to get the Leibniz inequality. Thus much of what 
follows actually works for order-unit spaces.) 

We saw in Proposition 10.3 that we can extend $(\Omega, d)$ to obtain an inner first-order
calculus. In analogy with this idea, suppose that we can realize $\Omega$ as a subspace of
$B(\mathcal{H})$ for some Hilbert space $\mathcal{H}$, in such a way that the norm on
$\Omega$ is the operator norm, and the bimodule structure is given by two *-representations,
$\pi_1$ and $\pi_2$, of A on $\mathcal{H}$, so that

\[ a\,\omega\,b = \pi_1(a)\,\omega\,\pi_2(b) \]

for $a, b \in A$ and $\omega \in \Omega$. Suppose further that there is a possibly-unbounded
essentially self-adjoint operator, $D_0$, on $\mathcal{H}$, such that $\pi_1(a)$ and
$\pi_2(a)$ carry the domain of $D_0$ into itself, and such that

\[ da = D_0\pi_2(a) - \pi_1(a)D_0, \]

which in particular must be a bounded operator. Set $L(a) = \|da\|$. This is not exactly
the Dirac operator setting, but it is not difficult to convert it into that setting. To arrange
matters so that we have only one representation, we let $\pi = \pi_1 \oplus \pi_2$ on
$\mathcal{H} \oplus \mathcal{H}$ and set

\[ D_1 = \begin{pmatrix} 0 & D_0 \\ 0 & 0 \end{pmatrix}. \]

Then we find that

\[ L(a) = \|[D_1, \pi(a)]\|. \]

But of course $D_1$ is not self-adjoint. We fix this in the traditional way by again doubling
the Hilbert space, with representation $\pi \oplus \pi$ of A, and setting

\[ D = \begin{pmatrix} 0 & D_1^* \\ D_1 & 0 \end{pmatrix}. \]

The corresponding Lipschitz norm is $L(a) \vee L(a^*)$, but from the self-adjointness of D one
can check that we actually get back L.



Anyway, we are left with

11.1 Question. For an order-unit space A, or a *-algebra A with C*-norm, how does one
characterize those Lip-norms on A which come from the Dirac operator construction? 

Even for finite-dimensional commutative C*-algebras it is not clear to me what the 
answer is. 

As mentioned earlier, a Dirac operator also gives seminorms on all of the matrix algebras
over A, so that one can speak of this family as a "matrix Lipschitz norm", in the spirit of
[Ef]. Thus a related problem is to characterize these structures.

Of course a given metric on S(A) may come from several fairly different Dirac operators.
For example, suppose that we have a compact space X, and a closed neighborhood W of
the diagonal $\Delta$ of $X \times X$, together with a cost function $\rho$ on W, just as in
the previous section. As discussed there, we can use $\rho$ together with the first-order
calculus determined by W to define a Lipschitz norm on C(X). (Further hypotheses are needed
for it to be a Lip-norm on a dense subalgebra of C(X).) Then by the procedure discussed
earlier in the present section we can pass to a Dirac operator. But that procedure enlarged
the Hilbert space because a first-order differential calculus usually involves two
representations rather than one. We will now show that there is an alternative method which
does not enlarge the Hilbert space. This is a mild generalization of my lecture comments for
metric spaces mentioned earlier, whose details are indicated on page 274 of [W2]. As earlier,
let m be a measure on X of full support, and consider $m \times m$ on $W \setminus \Delta$.
Form the Hilbert space $\mathcal{H} = L^2(W \setminus \Delta, m \times m)$. We consider only
the representation $\pi$ of A = C(X) on $\mathcal{H}$ defined by

\[ (\pi_f \xi)(x, y) = f(x)\,\xi(x, y). \]

(This is, of course, essentially the left action on the bimodule for W.) Define an operator,
F, on $\mathcal{H}$ by the flip

\[ (F\xi)(x, y) = \xi(y, x). \]

Because we are using a product measure, the operator F is self-adjoint and unitary. Define
an (unbounded) positive operator, P, on $\mathcal{H}$ by

\[ (P\xi)(x, y) = \xi(x, y)/\rho(x, y). \]

Because we assume that $\rho(x, y) = \rho(y, x)$ for all $(x, y) \in W$, the operators F and
P commute. We define the Dirac operator by

\[ D = PF, \]

so that F is the phase of D and $P = |D|$. Informal calculation shows that for any
$f \in C(X)$ we have

\[ ([D, \pi_f]\xi)(x, y) = \bigl((f(y) - f(x))/\rho(x, y)\bigr)\,\xi(y, x), \]

so that

\[ L(f) = \|[D, \pi_f]\| = \sup\{|f(y) - f(x)|/\rho(x, y) : (x, y) \in W\}. \]

Of course, further hypotheses must be placed on $\rho$ in order for this to give a Lip-norm.
But the right-hand side of the above equality is the usual definition of a Lipschitz norm
in this situation, especially in contexts such as graph theory. It will coincide with what
one obtains in the corresponding bimodule approach. Notice that the resulting distance
between two points $x, y \in X$ can easily be strictly smaller than $\rho(x, y)$ (if $(x, y)$
happens to be in W).
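
The informal commutator calculation for D = PF can be checked numerically on a finite example. The sketch below makes several assumptions that are not in the paper: X has three points, m is counting measure, W consists of all ordered pairs of distinct points, and the symmetric cost function `rho` is an arbitrary illustrative choice (it is deliberately not a metric). Vectors in $L^2(W)$ are stored as dictionaries keyed by pairs. Since the commutator is a multiplication operator composed with the unitary flip, its norm is the supremum of the multiplier, and this is already attained on the basis vectors.

```python
# Hypothetical symmetric cost function on W = all ordered pairs of distinct points
# of X = {0, 1, 2}; counting measure on X is assumed.
rho = {(0, 1): 1.0, (0, 2): 2.0, (1, 2): 0.5}
rho.update({(y, x): v for (x, y), v in list(rho.items())})
W = sorted(rho)

def pi(f, xi):
    # (pi_f xi)(x, y) = f(x) xi(x, y)
    return {(x, y): f[x] * xi[(x, y)] for (x, y) in W}

def flip(xi):
    # (F xi)(x, y) = xi(y, x)
    return {(x, y): xi[(y, x)] for (x, y) in W}

def P(xi):
    # (P xi)(x, y) = xi(x, y) / rho(x, y)
    return {(x, y): xi[(x, y)] / rho[(x, y)] for (x, y) in W}

def D(xi):
    # D = P F
    return P(flip(xi))

def commutator(f, xi):
    # [D, pi_f] xi = D pi_f xi - pi_f D xi
    a, b = D(pi(f, xi)), pi(f, D(xi))
    return {w: a[w] - b[w] for w in W}

def predicted(f, xi):
    # Right-hand side of the formula for ([D, pi_f] xi)(x, y).
    return {(x, y): (f[y] - f[x]) / rho[(x, y)] * xi[(y, x)] for (x, y) in W}

def lipschitz(f):
    return max(abs(f[y] - f[x]) / rho[(x, y)] for (x, y) in W)

def max_norm_on_basis(f):
    # Largest norm of [D, pi_f] applied to a basis vector e_w.
    best = 0.0
    for w in W:
        e = {v: (1.0 if v == w else 0.0) for v in W}
        image = commutator(f, e)
        best = max(best, sum(t * t for t in image.values()) ** 0.5)
    return best

f = [0.0, 1.0, 3.0]
xi = {w: float(i + 1) for i, w in enumerate(W)}
print(commutator(f, xi) == predicted(f, xi))
print(lipschitz(f), max_norm_on_basis(f))
```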

For an interesting alternative (but closely related) method of obtaining the usual dis- 
tance on a graph (including infinite graphs) from a cost function, by means of Dirac 
operators, see theorem 7.2 of [Da]. Furthermore, in [Da] other very interesting and quite 
different Dirac operators associated to cost functions on graphs are discussed in some 
detail, and used to obtain improved estimates for heat kernels on graphs. They can be 
described in terms of first-order differential calculi and Laplace operators along much the 
same lines as we used in Section 10. Much of this is explicit in [Da], and we will not 
elaborate on it here. 

We should mention here that very interesting examples of Dirac operators associated 
with non-commutative variants of sub-Riemannian manifolds appear in the second example 
following axiom 4' of [C3], and in [W5].

12. Resistance distance 

We conclude with an appealing class of examples which do not fit into the previous
framework of differential calculi, and for which the Lip-norm does not satisfy the Leibniz
inequality. These examples come from graphs with "cost functions" on the edges, but now
the graph is interpreted as an electrical circuit with resistances on the edges, whose values
are given by the cost function. These examples have been extensively studied [DS, Kl,
KIR, KZ], but I have not seen earlier mention of the corresponding metric on probability
measures which we will define here. It is not clear to me whether this metric is more than
a curiosity.

All of the discussion here can be carried out for infinite graphs, along the lines discussed 
extensively in [DS], but for simplicity we only discuss finite graphs here. The examples also 
have a fine alternative interpretation in terms of random walks [DS]. Our term "resistance 
distance" is taken from the title of [KIR]. 

The set-up, as indicated above, is a finite graph with set X of vertices, together with
strictly positive real numbers $r_{xy} = r_{yx}$ assigned to each (undirected) edge. We
interpret these numbers as resistances. We assume throughout that the graph is connected.
Given $x, y \in X$, $x \ne y$, we can imagine putting a voltage difference across x and y,
adjusted so that one unit of current flows in at x and out at y. Then Ohm's law says that the
"effective resistance" is equal to the required voltage difference. We denote this effective
resistance by $\rho(x, y)$. It is, in fact, a metric on X. The only references I know for
this are [KIR, K, KZ], but my friends in probability theory tell me that within the context
of random walks rather than resistances this is well-known, even if no reference comes to
mind.

Suppose now that $\mu$ and $\nu$ are general probability measures on X. Although it does
not seem so intuitively obvious, we will see shortly that we can establish voltages on the
points of X such that unit total current flows into the circuit, with the amount flowing
in at each point x given by $\mu_x$, while unit total current flows out of the circuit, with
the amount at each point given by $\nu$ (with the evident interpretation when the supports of
$\mu$ and $\nu$ are not disjoint). For the analysis of this situation it is useful to define
a function, c, on the edges, by $c_{xy} = 1/r_{xy}$. This is commonly called the
"conductance". It is convenient to extend c to all of $X \times X$ by setting $c_{xy} = 0$ if
$(x, y)$ is not an edge (or if y = x). Let $f \in C(X)$, interpreted as voltages applied to
the points of X. We let df be defined as earlier for the universal calculus (or for the
calculus corresponding to the edges). We let $\nabla f$ denote the resulting flow inside the
circuit. By Ohm's law the flow (before electrons were discovered) from x to y is given by

\[ (\nabla f)(x, y) = (f(x) - f(y))\,c_{xy} = -\bigl(c\,(df)\bigr)(x, y), \]

where by $c\,(df)$ we mean the pointwise product of functions. Note that $\nabla f$ is a
function on directed edges, with

\[ (\nabla f)(x, y) = -(\nabla f)(y, x) \]

(and value 0 if $(x, y)$ is not an edge).

Suppose now that $\omega$ is any function on directed edges such that
$\omega(x, y) = -\omega(y, x)$. We interpret $\omega(x, y)$ as giving the magnitude of a
current from x to y. (To be more realistic we should require circulation, but we will have no
need to impose this requirement.) To sustain this current, we will in general have to insert
(or extract) current at various vertices. We let $\mathrm{div}(\omega)(x)$ denote the current
which must be inserted at x. By Kirchhoff's laws we have

\[ \mathrm{div}(\omega)(x) = \sum_y \omega(x, y). \]

Note that because $\omega(x, y) = -\omega(y, x)$, we will have

\[ \sum_x \mathrm{div}(\omega)(x) = 0, \]

which accords with the fact that the total amount of current inserted must be 0.

Suppose now that $f \in C(X)$ and that we set $\omega = \nabla f$. We see from above that the
currents which must be inserted to sustain the voltages given by f must be

\[ \mathrm{div}(\nabla f), \]

which we denote by $\Delta f$. To accord with our earlier notation, we let $A'^{0}$ denote
the signed measures, $\lambda$, on X for which $\langle 1, \lambda \rangle = 0$. The
discussion of the previous paragraph can be interpreted as saying that
$\Delta f \in A'^{0}$.

Suppose now that we are given $\lambda \in A'^{0}$. Can we find f such that
$\Delta f = \lambda$? Note that since $\Delta 1 = 0$, we know that f will not be unique, but
rather that, as usual with potential functions, we can expect f to be unique only up to a
constant function. To proceed further we must more carefully analyze the operator $\Delta$
in the traditional way [DS, K]. For $f \in C(X)$ we have

\[ (\Delta f)(x) = \sum_y (\nabla f)(x, y) = \sum_y (f(x) - f(y))\,c_{xy}
   = f(x) \sum_y c_{xy} - \sum_y f(y)\,c_{xy}. \]

Let D denote the diagonal matrix with diagonal entries

\[ D_{xx} = \sum_y c_{xy}, \]

and let C denote the matrix with entries $c_{xy}$. If we view f as a column vector, we see
that

\[ \Delta f = (D - C)f. \]

From the Perron-Frobenius theorem and the fact that our graph is connected, it follows
that the kernel of $\Delta$ consists exactly of the constant functions. If we permit
ourselves to confuse vector spaces a bit, we see that $\Delta$ is self-adjoint with respect to
the standard inner-product on column vectors. Thus it carries the orthogonal complement,
$\mathcal{H}$, of the constant functions into itself, and it is invertible on $\mathcal{H}$.
Consequently, for every $\lambda \in A'^{0}$ we can find a unique $f \in \mathcal{H}$ such
that $\Delta f = \lambda$. We will write this as $f = \Delta^{-1}\lambda$, where we view
$\Delta$ as restricted to $\mathcal{H}$ so that it is invertible there.

Suppose now that x and y are fixed points of X, and that $\lambda = \delta_x - \delta_y$,
where $\delta_x$ denotes the $\delta$-measure at x. Thus we are inserting one unit of current
at x and extracting it at y. Let $f = \Delta^{-1}\lambda$. According to our earlier comments,
the effective resistance from x to y, $\rho(x, y)$, is given by
$f(x) - f(y) = (\Delta^{-1}\lambda)(x) - (\Delta^{-1}\lambda)(y)$. It is now easy to see why
$\rho$ is a metric, along the lines given in [KIR]. If z is any other point of X, let

\[ g = \Delta^{-1}(\delta_x - \delta_z), \qquad h = \Delta^{-1}(\delta_z - \delta_y). \]

Clearly f = g + h, so

\[ \rho(x, y) = g(x) - g(y) + h(x) - h(y). \]

But simple considerations show that g must take its maximum and minimum values at x
and z, so that

\[ g(x) - g(y) \le g(x) - g(z) = \rho(x, z). \]

Similarly h takes its maximum and minimum values at z and y, so that
$h(x) - h(y) \le h(z) - h(y) = \rho(z, y)$. The triangle inequality for $\rho$ follows.
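
The effective resistance is straightforward to compute from the description above. The following sketch rests on assumptions of mine rather than on the paper: the circuit `circuit` is a hypothetical example, the linear system is solved by plain Gaussian elimination, and the potential is normalized by grounding the last vertex instead of projecting onto the orthogonal complement of the constants (the two normalizations differ by an additive constant, which does not affect differences f(x) - f(y)).

```python
def laplacian(n, resistances):
    """Matrix of the operator D - C, where C has entries c_xy = 1 / r_xy."""
    lap = [[0.0] * n for _ in range(n)]
    for (x, y), r in resistances.items():
        c = 1.0 / r
        lap[x][x] += c
        lap[y][y] += c
        lap[x][y] -= c
        lap[y][x] -= c
    return lap

def solve(a, b):
    """Gaussian elimination with partial pivoting for a square system a z = b."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][k] * z[k] for k in range(i + 1, n))) / m[i][i]
    return z

def potential(n, resistances, lam):
    """A solution f of (D - C) f = lam for lam of total mass 0, with f = 0 at the
    last vertex (grounding), which fixes the free additive constant."""
    lap = laplacian(n, resistances)
    reduced = [row[:n - 1] for row in lap[:n - 1]]
    return solve(reduced, lam[:n - 1]) + [0.0]

def effective_resistance(n, resistances, x, y):
    lam = [0.0] * n
    lam[x] += 1.0   # one unit of current inserted at x
    lam[y] -= 1.0   # and extracted at y
    f = potential(n, resistances, lam)
    return f[x] - f[y]

# Hypothetical circuit on four vertices; values are resistances on undirected edges.
circuit = {(0, 1): 1.0, (1, 2): 2.0, (2, 3): 1.0, (0, 3): 4.0, (0, 2): 3.0}
print(effective_resistance(4, circuit, 0, 2))
```

Two sanity checks that follow from the usual series and parallel rules: a path with resistances 1 and 2 gives resistance 3 between its endpoints, and a triangle of unit resistors gives 2/3 between any two vertices.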

But we are interested more generally in the effective resistance between $\mu$ and $\nu$
where $\mu$ and $\nu$ are arbitrary probability measures, and it is not even clear how this
should be defined. (It does not seem natural just to use the Monge-Kantorovich metric from
$\rho$.) In view of our earlier considerations we should form $\lambda = \mu - \nu$, and so
we need an appropriate norm on $A'^{0}$, and this should be the dual norm of a Lip-norm, say
L, on C(X), probably defined by means of a norm on $\Omega^u$. The dual norm, $L'$, should be
such that if $\lambda = \delta_x - \delta_y$, then
$L'(\lambda) = (\Delta^{-1}\lambda)(x) - (\Delta^{-1}\lambda)(y)$. But as remarked above,
$\Delta^{-1}\lambda$ takes its maximum and minimum values at x and y. Thus a norm which will
meet this requirement is

\[ L'(\lambda) = 2\,\|\Delta^{-1}\lambda\|^{\sim}_{\infty}, \]

where $\|\cdot\|^{\sim}_{\infty}$ is as defined in Section 1. To find L on C(X) we use the
self-adjointness of $\Delta$ to calculate, for $g \in C(X)$ and any $\lambda \in A'^{0}$,

\[ \langle g, \lambda \rangle = \langle g, \Delta\Delta^{-1}\lambda \rangle
   = \langle \Delta g, \Delta^{-1}\lambda \rangle. \]

The supremum over $\lambda$ such that $2\,\|\Delta^{-1}\lambda\|^{\sim}_{\infty} \le 1$ is
the same as the supremum of

\[ \langle (1/2)\Delta g, h \rangle \]

over h such that $\|h\|^{\sim}_{\infty} \le 1$. But we saw earlier that this gives just the
restriction to $A'^{0}$ of the dual norm for $\|\cdot\|_\infty$ on C(X), which is the
$L^1$-norm. Thus we see that we must set

\[ L(g) = (1/2)\,\|\Delta g\|_1 = (1/2) \sum_x |(\Delta g)(x)|
   = (1/2) \sum_x \Bigl| \sum_y (g(x) - g(y))\,c_{xy} \Bigr|
   = (1/2) \sum_x \Bigl| \sum_y dg(x, y)\,c_{xy} \Bigr|. \]

This is certainly rather different from the usual Lip-norms for metrics on finite sets. The
above expression suggests that we define a seminorm, N, on $\Omega^u$ by

\[ N(\omega) = (1/2) \sum_x \Bigl| \sum_y \omega(x, y)\,c_{xy} \Bigr|, \]

so that we have

\[ L(g) = N(dg). \]

Reversal of the earlier calculation shows that the dual norm is the $L'$ considered above, so
that we obtain the desired $\rho(\mu, \nu)$. However N will not usually be a bimodule norm, so
that we are not fully in the context of the previous sections, and L need not satisfy the
Leibniz inequality.
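
The seminorm L and the resulting distance between probability measures can both be computed directly. The sketch below is illustrative only: the triangle circuit, the helper names, and the choice to solve the Laplace equation by Gaussian elimination with the last vertex grounded are assumptions of mine. It uses the fact that twice the quotient norm of a function modulo constants is its maximum minus its minimum, so that $\rho(\mu, \nu) = \max f - \min f$ for any f with $\Delta f = \mu - \nu$.

```python
def laplacian_apply(n, conductance, g):
    """(Delta g)(x) = sum over y of (g(x) - g(y)) c_xy."""
    out = [0.0] * n
    for (x, y), c in conductance.items():
        out[x] += (g[x] - g[y]) * c
        out[y] += (g[y] - g[x]) * c
    return out

def lip(n, conductance, g):
    """L(g) = (1/2) * sum over x of |(Delta g)(x)|."""
    return 0.5 * sum(abs(v) for v in laplacian_apply(n, conductance, g))

def grounded_solve(n, conductance, lam):
    """Solve Delta f = lam (lam of total mass 0) with f = 0 at vertex n - 1."""
    size = n - 1
    m = [[0.0] * (size + 1) for _ in range(size)]
    for (x, y), c in conductance.items():
        for u, v in ((x, y), (y, x)):
            if u < size:
                m[u][u] += c
                if v < size:
                    m[u][v] -= c
    for i in range(size):
        m[i][size] = lam[i]
    for col in range(size):  # Gaussian elimination with partial pivoting
        piv = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for k in range(col, size + 1):
                m[r][k] -= factor * m[col][k]
    f = [0.0] * n
    for i in reversed(range(size)):
        f[i] = (m[i][size] - sum(m[i][k] * f[k] for k in range(i + 1, size))) / m[i][i]
    return f

def distance(n, conductance, mu, nu):
    """rho(mu, nu) = L'(mu - nu) = max f - min f, where Delta f = mu - nu."""
    lam = [mu[i] - nu[i] for i in range(n)]
    f = grounded_solve(n, conductance, lam)
    return max(f) - min(f)

# Hypothetical circuit: a triangle with unit conductances.
tri = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}
print(distance(3, tri, [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # point masses
print(distance(3, tri, [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]))  # a spread-out measure
print(lip(3, tri, [1.0, 0.0, 0.0]))
```

For two point masses the distance reduces to the effective resistance between the two vertices, and for any g the pairing $\langle g, \mu - \nu \rangle$ is bounded by $L(g)\,\rho(\mu, \nu)$, as the duality requires.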

I must admit that I see no particularly natural interpretation for L(g), nor for $\rho(\mu, \nu)$,
even if we call the latter "effective resistance". If g were interpreted as giving voltages
on X, then L(g) would be half the sum of the absolute values of the currents inserted 
or extracted from the circuit, and thus exactly the sum of the currents inserted into the 
circuit (disregarding the currents extracted). But I do not see why it is natural to give 
g such an interpretation as voltages. If one goes back to the effective resistance between 
two points, then it is easily seen that this is equal to the energy dissipated by the circuit 
when one unit of current is inserted. This suggests using the dissipated energy in the more 
general case of arbitrary probability measures $\mu$ and $\nu$. But the energy dissipated along
any edge varies as the square of the current, and one can see by examples that this causes 
the triangle inequality to fail. One does obtain a metric if one uses the square-root of 
the dissipated energy, but this does not give the correct value for the effective resistance 
between two points. These possibilities are not far from the Lipschitz norm used right 
after lemma 4.1 of [Da] to define the metric denoted there by d^. This Lipschitz norm 
can be interpreted as the supremum over the points x of X of the square roots of the 
energy dissipations in all the edges beginning at x. Perhaps the discussion of Dirichlet 
spaces given in section 6 of [W6], or the "twisted bimodule structure" and corresponding 
differential discussed beginning on page 149 of [Me] in connection with Hudson's treatment 
of discrete flows and stochastic differential equations, could be used to shed more light on 
this. Or perhaps some of the stopping rules or mixing times considered for Markov chains, 
as discussed in [LW], are relevant. 

Finally, we remark that it would be interesting to study resistance distance in the 
continuous case, for example for thin plates of resistance metal of various shapes. 